Hi Alex,
Your suggestion works. However, we are having a performance issue with larger documents.
What we are doing is taking a document and splitting it apart so that every page becomes its own document.
We want to make sure that each page retains its form fields.
We are doing this with the code below.
We call this method in a loop, once for each page in the document. The document is passed into the method along with the page number, which is wrapped in a list with one item.
At times we also use this method to remove a range of pages. For my issue, though, we are splitting out every page, so for a 20-page document this method is called 20 times, and each time it is passed the document and the page number that is being split out.
We copy the document.
We create a list of the pages that are kept in the document. For my task, that is all the pages except for the passed-in page number.
We then remove from the source document the form fields that belong to the page we are splitting out, and then the page itself.
We then remove the unwanted pages and their form fields from the copied document. The result is a one-page document, so we are removing every page, along with its form fields, except for the one page we are keeping.
This works, except that with larger documents it can take close to four minutes: that is what we see with a 100-page document with around 100 form fields per page.
The biggest time delay is removing all of the fields and all the pages from the copied document that we are splitting out.
Each call takes only a few seconds, but with a one-hundred-page document it adds up.
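To put rough numbers on why it adds up, here is a back-of-the-envelope sketch. It assumes exactly 100 pages with 100 fields each, and that the source document shrinks by one page per call (which is what the method below does). The class and method names are made up for illustration and are not part of any library.

```csharp
using System;

public static class SplitCostEstimate
{
    // Counts the field removals performed on the copied documents across all calls,
    // assuming the source document loses one page per call.
    public static long FieldRemovalsInCopies(int pageCount, int fieldsPerPage)
    {
        long total = 0;
        for (int remainingPages = pageCount; remainingPages >= 1; remainingPages--)
        {
            // Each copy keeps one page; every other page and its fields are removed.
            total += (long)(remainingPages - 1) * fieldsPerPage;
        }
        return total;
    }

    public static void Main()
    {
        Console.WriteLine(FieldRemovalsInCopies(100, 100)); // prints 495000
    }
}
```

So under those assumptions the copies alone account for roughly half a million field removals, which grows with the square of the page count.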
Is there a faster, more efficient way that I can do this?
Here is the method that we are using to split the document apart to separate pages.
We get the page list and call this method in a loop for each page.
Any information on how to improve the performance would be appreciated.
private static FileStream CopyPageToDocument(PdfDocument sourceDocument, List<int> extractedPages)
{
    FileStream temporaryParentStream = null;
    using (new StopwatchLog($"Copying {extractedPages.Count} page(s)."))
    {
        // Create the document and copy from the source.
        using (PdfDocument pageDocument = new PdfDocument())
        {
            // Create the copy command from the newly created page document.
            PdfDocumentCopyCommand documentCopyCommand = new PdfDocumentCopyCommand(pageDocument)
            {
                CopyBookmarks = true,
                CopyDocumentLevelJavaScripts = true,
                CopyInteractiveForm = true
            };

            // Execute the copy from the source document to the target document.
            documentCopyCommand.Execute(sourceDocument);

            // Create a list of pages that will be kept in the source document.
            // The extractedPages list is 1 based. Keep both lists 1 based.
            List<int> pagesToKeep = new List<int>();
            for (int pageIndex = 1; pageIndex <= sourceDocument.Pages.Count(); pageIndex++)
            {
                if (!extractedPages.Contains(pageIndex))
                {
                    pagesToKeep.Add(pageIndex);
                }
            }

            // Remove the extracted pages from the source document, highest page number first,
            // so that earlier removals do not shift the remaining indexes.
            // Work on a sorted copy so the caller's list is not modified.
            List<int> pagesToExtract = new List<int>(extractedPages);
            pagesToExtract.Sort();
            pagesToExtract.Reverse();
            foreach (int pageIndex in pagesToExtract)
            {
                // Check to make sure the InteractiveForm exists. There are scenarios where it is null.
                if (sourceDocument.InteractiveForm != null)
                {
                    // Get the fields located on the extracted page and remove them from the source document.
                    PdfInteractiveFormField[] pageFields = sourceDocument.InteractiveForm.GetFieldsLocatedOnPage(sourceDocument.Pages[pageIndex - 1]);
                    foreach (PdfInteractiveFormField field in pageFields)
                    {
                        field.Remove();
                    }
                }

                // Remove the extracted page.
                sourceDocument.Pages.RemoveAt(pageIndex - 1);
            }

            // Remove the pages that are kept in the source document from the copied document.
            // pagesToKeep is in ascending order, so reversing it removes the highest page first.
            pagesToKeep.Reverse();
            foreach (int pageIndex in pagesToKeep)
            {
                // Check to make sure the InteractiveForm exists. There are scenarios where it is null.
                if (pageDocument.InteractiveForm != null)
                {
                    // Get the fields that belong to a page kept with the source document and remove them from the copied document.
                    PdfInteractiveFormField[] pageFields = pageDocument.InteractiveForm.GetFieldsLocatedOnPage(pageDocument.Pages[pageIndex - 1]);
                    foreach (PdfInteractiveFormField field in pageFields)
                    {
                        field.Remove();
                    }
                }

                // Remove the unwanted page.
                pageDocument.Pages.RemoveAt(pageIndex - 1);
            }

            // Return the copied document as a file stream.
            temporaryParentStream = new FileStream(FileUtilities.GetTemporaryFile(FileUtilities.TemporarySubFolders.FileProcessing), FileMode.CreateNew, FileAccess.ReadWrite, FileShare.ReadWrite, 4096, FileOptions.DeleteOnClose);
            pageDocument.SaveChanges(temporaryParentStream);
        }
    }
    return temporaryParentStream;
}