Reading Word Documents in WSH

Written by Nilpo in Windows Scripting | 8 Comments

“How can I read the contents of a Microsoft Word document using VBScript in WSH?” – Bill S.

Reading the contents of a Microsoft Word document in WSH is actually pretty simple. This method works from any WSH language, but I’ll be using VBScript.

The most reliable way to read a Word document is to let the Word application do it, so that’s exactly what we’re going to do. Your script should start by defining a constant and a variable.

Const wdDoNotSaveChanges = 0
strDocument = "C:\mydoc.docx"

The most important variable here is the path to the Word document that you wish to read. This should be the full path. It can be any valid local or UNC path. The wdDoNotSaveChanges constant will be used to suppress the save changes dialog later.

Set objWord = CreateObject("Word.Application")
objWord.Visible = False
objWord.DisplayAlerts = False
objWord.Documents.Open strDocument,, True
Set objDoc = objWord.ActiveDocument

VBScript’s CreateObject function is used to create an instance of the Word automation object, which starts a copy of the Word application in memory. Setting the automation object’s Visible property to False keeps the application hidden, and setting DisplayAlerts to False suppresses dialog boxes. Finally, the Word document is opened read-only (the third argument to Documents.Open) and becomes the active document.

Set objRange = objDoc.Content
strContents = objWord.CleanString(objRange.Text)

Once the document is open, you must create a range containing the contents of the document. This can be done quite easily using the Document object’s Content property. Once you have a range, its Text property returns the text contained within that range. Additionally, I’ve used the Word object’s CleanString method to remove nonprinting characters and special Word characters from the text string.
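
If only part of the document is needed, the same Text property works on a smaller range. The following is a minimal sketch, assuming objWord and objDoc are the objects created above. The 200-character limit and the names lngEnd and objPartial are illustrative and not part of the original script.

```vbscript
' Sketch only: objWord and objDoc are assumed to be the objects created above.
' lngEnd and objPartial are hypothetical names used for illustration.
lngEnd = objDoc.Content.End          ' position just past the last character
If lngEnd > 200 Then lngEnd = 200    ' never ask for more than the document holds

' Document.Range(Start, End) returns a range covering those character positions
Set objPartial = objDoc.Range(0, lngEnd)
WScript.Echo objWord.CleanString(objPartial.Text)
```

This has to run while the document is still open, that is, before Word is closed.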

objWord.Quit wdDoNotSaveChanges
 
MsgBox strContents

As you can see, reading from a Word document in WSH is accomplished very easily using the Word automation object.
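
One case the walkthrough does not cover is a document that fails to open, which can leave the hidden Word instance running. Here is one possible way to guard against that. It is a sketch rather than a definitive version, and the error messages and the use of WScript.Echo instead of MsgBox are my own choices.

```vbscript
Option Explicit

Const wdDoNotSaveChanges = 0

Dim strDocument, objWord, objDoc, strContents

strDocument = "C:\mydoc.docx"   ' same example path as above

Set objWord = CreateObject("Word.Application")
objWord.Visible = False
objWord.DisplayAlerts = False

' Trap errors from here on so that Word is always closed,
' even if the document cannot be opened or read.
On Error Resume Next
Set objDoc = objWord.Documents.Open(strDocument, , True)   ' True = read-only

If Err.Number <> 0 Then
    WScript.Echo "Could not open " & strDocument & ": " & Err.Description
    Err.Clear
Else
    strContents = objWord.CleanString(objDoc.Content.Text)
    If Err.Number = 0 Then
        WScript.Echo strContents
    Else
        WScript.Echo "Could not read the document: " & Err.Description
        Err.Clear
    End If
End If

' Always shut Word down, whether or not the read succeeded.
objWord.Quit wdDoNotSaveChanges
On Error GoTo 0
```

Run it with cscript.exe to get the text on the console, or with wscript.exe to get it in a dialog.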

Be careful when using this method. Remember that you are reading the entire contents of the file into a single string. Very large documents can use a lot of memory, and MsgBox only displays roughly the first 1,024 characters of the text you pass it. In these cases, it is better to move through the Paragraph objects in your document one at a time. I’ll demonstrate that in a later article.
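
In the meantime, here is a rough sketch of the paragraph-by-paragraph idea. It assumes objWord and objDoc are the objects created earlier and that it runs before objWord.Quit. The names objPara and strLine are illustrative only.

```vbscript
' Sketch only: objWord and objDoc are assumed to be the objects created earlier.
' objPara and strLine are hypothetical names used for illustration.
Dim objPara, strLine

For Each objPara In objDoc.Paragraphs
    ' Each paragraph exposes its own Range, so only one paragraph's
    ' text is held in the variable at a time.
    strLine = objWord.CleanString(objPara.Range.Text)

    ' A paragraph's text ends with its paragraph mark; trim it if present.
    If Right(strLine, 1) = vbCr Then strLine = Left(strLine, Len(strLine) - 1)

    WScript.Echo strLine
Next
```

Under wscript.exe each paragraph appears in its own dialog, so cscript.exe is the more practical host for this loop.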

8 Comments

  • I like the blog, but could not find how to subscribe to receive the updates by email. Can you please let me know?

  • Hi Hugo,

    You can click the Subscribe to Posts link in the sidebar (http://www.nilpo.com/subscribe/) or you can subscribe to the RSS feed (http://www.nilpo.com/feed/).

  • Good info!

    One thing: when I execute objWord.Quit wdDoNotSaveChanges, I’m getting a Save As prompt. Any idea why it would prompt me to Save As?

    I’m running this from ASP.Net, would that make a difference?

    nme = path.Substring(path.LastIndexOf("\") + 1)
    nme = nme.Substring(0, nme.LastIndexOf(".")) + ".rtf"
    wApp.Visible = False
    wDoc = wApp.Documents.Open(path, , False)
    wDoc.Activate()

    nme = path.Substring(0, path.LastIndexOf("\") + 1) + nme
    Dim type As Object = Microsoft.Office.Interop.Word.WdSaveFormat.wdFormatRTF
    wDoc.SaveAs(nme, type)
    Dim saveChanges As Object = Microsoft.Office.Interop.Word.WdSaveOptions.wdSaveChanges
    wDoc.Close(saveChanges)
    wApp.Quit()

    Thanks for your consideration.

  • Hi Jay,

    I don’t see where you’re using the wdDoNotSaveChanges constant in the example you’ve provided. It looks as though you’re just using the Quit method without any parameters.

  • Sorry,

    In the post, the Dim saveChanges line has wdSaveChanges instead of wdDoNotSaveChanges because I was still testing.

    Also, I close the Word document (your objDoc) with wdDoNotSaveChanges, so I should be able to just quit the application without having to reference wdDoNotSaveChanges again.

    However, for testing purposes, I have included both, as you will notice below.

    But whether I set saveChanges to wdSaveChanges or wdDoNotSaveChanges, it still gives me the prompt.

    For the sake of it, I even iterate through the documents within the app, just in case there’s more than one, and save each file before closing the doc with the saveChanges variable, which is set to wdDoNotSaveChanges.

    nme = path.Substring(path.LastIndexOf("\") + 1)
    nme = nme.Substring(0, nme.LastIndexOf(".")) + ".rtf"
    wApp.Visible = False
    wDoc = wApp.Documents.Open(path, , True)
    wDoc.Activate()

    nme = path.Substring(0, path.LastIndexOf("\") + 1) + nme
    Dim type As Object = Microsoft.Office.Interop.Word.WdSaveFormat.wdFormatRTF
    wDoc.SaveAs(nme, type)

    Dim saveChanges As Object = Microsoft.Office.Interop.Word.WdSaveOptions.wdDoNotSaveChanges
    For Each doc As Microsoft.Office.Interop.Word.Document In wApp.Documents
    doc.Save()
    doc.Close(saveChanges)
    Next
    wApp.Quit(saveChanges)

  • It works!

    I tested another Word app. If it doesn’t have complex macros, then the Word doc will close without any prompts. If it has complicated macros embedded, then it won’t.

    Since 99% of the Word files I’ll be reading will have narrative, I’m good.

    Thank you again for your time and consideration!

  • Jay,

    It sounds like you might be using Word 2007. It uses a separate file format for macro-enabled documents. In your code, you specify the Save As type as RTF (which doesn’t support macros) so when Word closes, it still wants to save the open macro-enabled document. Try using:

    Dim type As Object = Microsoft.Office.Interop.Word.WdSaveFormat.wdFormatXMLDocumentMacroEnabled

  • I will give that a try.

    thanks again!
