How to read Microsoft PowerPoint document contents using C-Sharp/.NET?

In this article, we will learn how to read Microsoft PowerPoint document contents using C#/.NET. - Article by Kunal Chowdhury


Recently, we learnt how to read Microsoft Word and Microsoft Excel document contents (text only) using the interop APIs exposed by Microsoft. Now, what about reading the text content of PowerPoint slides? This can be achieved using another interop assembly.

 

Today we will discuss how to extract the text from PPT files using 'Microsoft.Office.Interop.PowerPoint.dll'. The code is shared below for easy reference.

 


 

First, create an instance of the PowerPoint application. Then open the presentation you want to read by calling the 'pptPresentations.Open' method, as shown in the code snippet below. Next, iterate through the slides in the presentation and find the shapes that have a TextFrame. The TextFrame holds the text content of a shape on the slide. Finally, retrieve the TextRange from the TextFrame to extract the text.

 

 

Here's the complete source code for you to use. Please make sure to release every COM object properly, as the code does with its ReleaseComObject calls:

 

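  // Requires references to the Microsoft.Office.Interop.PowerPoint and
  // Office (Microsoft.Office.Core) interop assemblies.
  using System;
  using System.IO;
  using System.Text;
  using Microsoft.Office.Core;
  using PowerPoint = Microsoft.Office.Interop.PowerPoint;

  public static string GetTextFromPowerPoint(string filePath)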
  {
      if (string.IsNullOrEmpty(filePath))
      {
          throw new ArgumentNullException("filePath");
      }
   
      if (!File.Exists(filePath))
      {
          throw new FileNotFoundException("Could not find file", filePath);
      }
   
      var stringBuilder = new StringBuilder();
   
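      PowerPoint.Application pptApp = null;
      PowerPoint.Presentations pptPresentations = null;
      PowerPoint.Presentation pptPresentation = null;
      PowerPoint.Slides pptSlides = null;

      try
      {
          pptApp = new PowerPoint.Application();
          pptPresentations = pptApp.Presentations;

          // Arguments after the path: ReadOnly = true, Untitled = false, WithWindow = false.
          pptPresentation = pptPresentations.Open(filePath,
                                   MsoTriState.msoTrue, MsoTriState.msoFalse, MsoTriState.msoFalse);
          pptSlides = pptPresentation.Slides;

          // PowerPoint collections are 1-based, so the loop runs from 1 to Count inclusive.
          var slidesCount = pptSlides.Count;
          for (int slideIndex = 1; slideIndex <= slidesCount; slideIndex++)
          {
              var slide = pptSlides[slideIndex];
              foreach (PowerPoint.Shape textShape in slide.Shapes)
              {
                  // Only shapes that have a text frame containing text are of interest.
                  if (textShape.HasTextFrame == MsoTriState.msoTrue &&
                           textShape.TextFrame.HasText == MsoTriState.msoTrue)
                  {
                      PowerPoint.TextRange pptTextRange = textShape.TextFrame.TextRange;
                      if (pptTextRange != null)
                      {
                          if (pptTextRange.Length > 0)
                          {
                              stringBuilder.Append(" " + pptTextRange.Text);
                          }

                          // Release the range even when it turned out to be empty.
                          ReleaseComObject(pptTextRange);
                      }
                  }

                  ReleaseComObject(textShape);
              }

              ReleaseComObject(slide);
          }
      }
      catch (Exception)
      {
          // Handle or log exceptions here. As written, a failure is swallowed and
          // whatever text was collected so far is returned.
      }
      finally
      {
          // Runs even if an exception occurred: close the presentation and quit PowerPoint
          // so that no hidden process is left behind, then release the COM objects.
          if (pptSlides != null) ReleaseComObject(pptSlides);
          if (pptPresentation != null)
          {
              pptPresentation.Close();
              ReleaseComObject(pptPresentation);
          }
          if (pptPresentations != null) ReleaseComObject(pptPresentations);
          if (pptApp != null)
          {
              pptApp.Quit();
              ReleaseComObject(pptApp);
          }
      }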
   
      return stringBuilder.ToString();
  }
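
The method above calls a ReleaseComObject helper that is not listed in this article. Here is a minimal sketch of what such a helper could look like, assuming it only needs to hand the object to 'Marshal.ReleaseComObject' once. The class name ComHelper is hypothetical and used for illustration only; in practice you can copy the method into the same class that contains GetTextFromPowerPoint so that the calls compile as written.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical container class, named for illustration only.
public static class ComHelper
{
    // Assumed shape of the helper used in the article: it accepts any object
    // and releases it only when it really is a COM wrapper.
    public static void ReleaseComObject(object obj)
    {
        // Marshal.IsComObject throws on null, so check for null first.
        // Ordinary managed objects are skipped because Marshal.ReleaseComObject
        // rejects anything that is not a COM object.
        if (obj == null || !Marshal.IsComObject(obj))
        {
            return;
        }

        // Decrements the reference count of the runtime callable wrapper once.
        Marshal.ReleaseComObject(obj);
    }
}
```

With the null and type checks in place, the helper can be called safely from cleanup code even when an object was never created.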

 

Was it helpful? Do let me know if you have any queries. Stay tuned for more articles and subscribe to my feed to get all the latest updates.

 




