2012-08-13

easily open csv files as all text fields with vbs send-to

when excel opens csv files, it will try to infer the data types of the cells based on the data in the file.  however, sometimes it gets it wrong. for example with zip codes, it will think they are plain numbers and remove leading 0's.

to open a csv file as text in excel 2007 or 2010, you can click on the data tab and click from text.  the text import wizard will let you click on each column and specify that it should be a text column so excel won't mess with the data.

it's clumsy to use the text import wizard every time you want to open a csv file with all the columns read as text.  my solution was to add an item to the "send-to" menu.  whenever you right click on a file in explorer, there's a "send to" menu, and you can add custom items to this menu.  i added a script that imports the file into excel.



here's how to set it up:

  1. open notepad
  2. paste the script at the bottom
  3. click file / save as...
  4. for the file name, type %userprofile%\AppData\Roaming\Microsoft\Windows\SendTo\openCsvAsText.vbs
  5. click save

now to use it, right click on a csv file, and click "send to" and then click openCsvAsText.vbs

that's it!

note:  the script actually opens the file twice.  first, it uses OpenText to open the csv file like normal.  then, the contents of the sheet is cleared (but the file is not saved).  then it uses the text import wizard to import the same data, but with different parsing options.  this is necessary because neither function does everything:  first, OpenText ignores the FieldInfo parameter when the file extension ends in "csv" -- it only listens when it's a "txt" file.  second, if you just use the text import wizard, then when you go to save the file, it will ask you where to save it like it's a new workbook.  this is because you'd literally be creating a new workbook and importing some data, which is not the same as opening the data directly.  so, we open the data directly, then import on top with the correct settings.

i would love to hear if this helped you.

here's the script:



Option Explicit

Const xlTextFormat = 2
Const xlTextQualifierDoubleQuote = 1
Const xlDelimited = 1
Const xlOverwriteCells = 0


'when using the send-to menu, windows explorer passes the filename of the right-clicked file
'as a command line argument to the application used to open the file in the send-to folder,
'in this case "Microsoft Windows Based Script Host".  this is the program that runs .vbs files.
'this program makes the filename command line argument available to this script via
'WScript.Arguments.Ietm(0).

Dim filename
filename = WScript.Arguments.Item(0)


'we need specify each column individually.  this says that the first 500 (0-499)
'columns should be read as text.  we will use fieldInfo later.

Dim fieldInfo(499)
Dim i
For i = 0 To UBound(fieldInfo)
    fieldInfo(i) = xlTextFormat
Next


'start excel, with no screen updating because it would flicker around
'while opening the file the second time (we read from the file twice
'due to limitations of excel).

Dim objExcel
Set objExcel = CreateObject("Excel.Application")
objExcel.Visible = True
'objExcel.ScreenUpdating = False 'optional

'open the file, so when we save it later, it knows where to save

Dim objWorkbook
objExcel.Workbooks.OpenText(filename) '.Add
Set objWorkbook = objExcel.ActiveWorkbook


'clear the contents of the file, because we opened it with all the default (wrong) settings

Dim objWorksheet
Set objWorksheet = objWorkbook.Worksheets(1)
objWorksheet.UsedRange.ClearContents

'set format of *all* cells to text, so when we import, we don't have
'the imported cells as "text" but the rest of them as "general".
objWorksheet.Cells.NumberFormat = "@"

'do the equivalent of using the text import wizard, but change some options:
'-RefreshStyle: overwrite cells instead of making room
'-PreserveFormatting: got rid of it
'-AdjustColumnWidth: set to false

Dim m
Set m = objWorksheet.QueryTables.Add("TEXT;" & filename, objWorksheet.Range("$A$1"))
m.FieldNames = True
m.RowNumbers = False
m.FillAdjacentFormulas = False
m.RefreshOnFileOpen = False
m.RefreshStyle = xlOverwriteCells
m.SavePassword = False
m.SaveData = True
m.AdjustColumnWidth = False
m.RefreshPeriod = 0
m.TextFilePromptOnRefresh = False
m.TextFilePlatform = 65001 'UTF-8. for "OEM United States" use 437
m.TextFileStartRow = 1
m.TextFileParseType = xlDelimited
m.TextFileTextQualifier = xlTextQualifierDoubleQuote
m.TextFileConsecutiveDelimiter = False
m.TextFileTabDelimiter = False
m.TextFileSemicolonDelimiter = False
m.TextFileCommaDelimiter = True
m.TextFileSpaceDelimiter = False
m.TextFileColumnDataTypes = fieldInfo
m.TextFileTrailingMinusNumbers = True
m.Refresh False


'the contents now should be the contents already in the file,
'so marked as saved so it does not prompt to save if no
'changes are made.

objWorkbook.Saved = True


'done.

objExcel.ScreenUpdating = True

10 comments:

  1. Its just superb guiding blog in my opinion....thank you very much for this info.

    ReplyDelete
  2. Great, but how can I open file in UTF-8?

    ReplyDelete
    Replies
    1. good idea, thanks for asking.

      change TextFilePlatform to 65001:

      m.TextFilePlatform = 65001 'UTF-8

      if you want to apply any other customizations, just get the developer tab to show, start recording a macro, go to the data tab and click "from text" in the "get external data" section. do your thing, then stop recording the macro, open visual basic and look under Module1 to see the macro. look for the line that has your customization and change my script accordingly.

      Delete
  3. Thank you very much!

    This will save a lot of time and trouble. I really can not understand why this is not (AFAIK) built in feature in excel. Working with MsSql CSV exports and Excel is a pain without this.

    So for my needs I needed to change two lines:
    m.TextFileSemicolonDelimiter = False ----> True
    and
    m.TextFileCommaDelimiter = True ----> False

    As in SQL exports are most often by default separated with semicolon (;) rather than comma (,).

    Lauri J.
    Finland

    ReplyDelete
  4. It worked for me from .Net as well.

    ReplyDelete
  5. Fantastic, thankyou.

    Just one extra, this just formats the cells for the imported data as text. How would I get all the cells on the sheet as text. this would help me when adding new data before saving, without having to manually format.

    David
    New Zealand

    ReplyDelete
    Replies
    1. Solved !!!
      Added .Cells.NumberFormat = "@" as follows:

      'clear the contents of the file, because we opened it with all the default (wrong) settings
      'format the entire sheet as Text to allow editing and adding without format issues.

      Dim objWorksheet
      Set objWorksheet = objWorkbook.Worksheets(1)
      With objWorksheet
      .UsedRange.ClearContents
      .Cells.NumberFormat = "@"
      End with


      David

      Delete
    2. good idea, david. thank you for posting your solution!

      Delete
  6. Excellent post. Couple of questions.

    1. What would I need to do to run it on a machine that does not have office installed.

    2. Instead of leaving the excel file open, what needs to be added to the script to close it and maybe save it as a .xls(x) file

    ReplyDelete
    Replies
    1. 1) you can use the office interop libraries to manipulate office documents without office installed and running. you could convert the above visual basic script into vb.net code and in your vb.net program, use the libraries. see here: http://www.microsoft.com/en-us/download/details.aspx?id=3508
      2) use the objWorkbook variable in the code above and call SaveAs. see here: http://msdn.microsoft.com/en-us/library/microsoft.office.tools.excel.workbook.saveas%28v=vs.80%29.aspx

      Delete