How to read Microsoft Excel document contents using C#/.NET?


Recently, we learned how to read text contents of Microsoft Word document from a .NET application using the APIs exposed in 'Microsoft.Office.Interop.Word.dll'. There may be a business requirement to read contents of Excel sheets too.

 

Today we are going to learn how to read Microsoft Excel documents. The complete source code has been shared for your easy reference.

 

How to read Microsoft Excel document contents using C#/.NET? (www.kunal-chowdhury.com)

 

First, you need to reference the 'Microsoft.Office.Interop.Excel.dll'. You can either get it from VSTO (Visual Studio Tools for Office) SDK, download from Microsoft server or install via NuGet.

 

Now, in your .NET app, create the instance of an Excel application. Then open the workbook, that you want to read, by calling the 'xlApp.Workbooks.Open' method as shown in the below code snippet. Next, you need to iterate through the sheets available in the workbook and find out the used range. It's the cells area where the data resides. Now iterate through this range for each sheet to find out the actual text content.

 

You would like to read:


How to read Microsoft Word document contents using C#/.NET?

How to read Microsoft Excel document contents using C#/.NET?

How to read Microsoft PowerPoint document contents using C#/.NET?


 

Here's the complete source code for you to use, but please make sure to properly release the COM objects at the place where it is mentioned:

 

   1:  public static string GetTextFromExcelDocument(string filePath)
   2:  {
   3:      var filePathAsString = filePath as string;
   4:      if (string.IsNullOrEmpty(filePathAsString))
   5:      {
   6:          throw new ArgumentNullException("filePath");
   7:      }
   8:   
   9:      if (!File.Exists(filePathAsString))
  10:      {
  11:          throw new FileNotFoundException("Could not find file", filePathAsString);
  12:      }
  13:   
  14:      var stringBuilder = new StringBuilder();
  15:      Excel.Application xlApp = new Excel.Application();
  16:      Excel.Workbook xlWorkbook = null;
  17:      Excel.Sheets xlWorksheets = null;
  18:   
  19:      try
  20:      {
  21:          xlWorkbook = xlApp.Workbooks.Open(filePath, 0, true, 5, "", "", true, 
  22:                                                  Excel.XlPlatform.xlWindows,
  23:                                                  "\t", false, false, 0, true,
  24:                                                  1, 0);
  25:          xlWorksheets = xlWorkbook.Sheets;
  26:   
  27:          if (xlWorksheets != null && xlWorksheets.Count > 0)
  28:          {
  29:              // read all worksheets of the excel document
  30:              foreach (Excel._Worksheet xlWorksheet in xlWorksheets)
  31:              {
  32:                  var xlUsedRange = xlWorksheet.UsedRange;
  33:                  if (xlUsedRange != null && xlUsedRange.Count > 0)
  34:                  {
  35:                      // read the valid range of cells
  36:                      foreach (Excel.Range xlRange in xlUsedRange)
  37:                      {
  38:                          if (xlRange.Value != null)
  39:                          {
  40:                              stringBuilder.Append(" " + xlRange.Value);
  41:                          }
  42:   
  43:                          ReleaseComObject(xlRange);
  44:                      }
  45:                  }
  46:   
  47:                  ReleaseComObject(xlUsedRange);
  48:                  ReleaseComObject(xlWorksheet);
  49:              }
  50:          }
  51:      }
  52:      catch (Exception ex)
  53:      {
  54:          // handle the exception, if any
  55:      }
  56:      finally
  57:      {
  58:          if (xlWorkbook != null) { xlWorkbook.Close(false); }
  59:          if (xlApp != null) { xlApp.Quit(); }
  60:   
  61:          ReleaseComObject(xlWorksheets);
  62:          ReleaseComObject(xlWorkbook);
  63:          ReleaseComObject(xlApp);
  64:      }
  65:   
  66:      return stringBuilder.ToString();
  67:  }

 

Was it helpful? Do let me know if you have any queries. Stay tuned for more updates.

 

 


If you have come this far, it means that you liked what you are reading. Why not reach little more and connect with me directly on Twitter, Facebook, Google+ and LinkedIn. I would love to hear your thoughts and opinions on my articles directly. Also, don't forget to share your views and/or feedback in the comment section below.

5 comments

  1. EPPlus is faster and easier; Aspose Cells is faster, easier, and more full-featured than EPPlus. However, Aspose Cells costs $, so YMMV.

    ReplyDelete
  2. This kind of reading/writing Excel documents has some disadvantages:
    1.) You must have MS Office installed (which might mean that you have to buy an additional license)
    2.) If you don't handle your exceptions right, several invisible Excel instances are still running
    3.) You have to do memory management (releasing COM objects)
    4.) It's slow

    Every single point from above can be eliminated with doing one thing different: Don't work with Excel itself -> work with the document. There is an Open Document Standard so there are several libraries that allow working directly with the document, e.g. Open XML SDK (https://github.com/OfficeDev/Open-XML-SDK). It's a lot (!!!) faster and you are not forced to have Excel installed.

    ReplyDelete
    Replies
    1. Yes Michael, you are right. It's sometimes pretty slow when working with a large collection of cells. Also, handling the COM and in-process Excel app was a nightmare for me initially.

      Thanks for sharing the other link. Do you know, whether that supports binary format, BTW?

      Delete
  3. This is a good fit for desktop automation where you'll already have office installed on the local machine. If you're building a server-side application I'd strongly recommend looking at the open source library called NPOI which handled both the old binary formats as well as the newer XML formats.

    ReplyDelete

 
© 2008-2017 Kunal-Chowdhury.com - Microsoft Technology Blog for developers and consumers | Designed by Kunal Chowdhury
Back to top