Get Text font

Questions, comments and suggestions concerning VintaSoft Imaging .NET SDK.

Moderator: Alex

David_karlsson
Posts: 8
Joined: Mon Jan 14, 2019 12:55 pm

Get Text font

Post by David_karlsson »

Hi!

I'm trying to replace text in a PDF.
First I used https://www.vintasoft.com/docs/vsimagin ... _Page.html to remove the text.
Then I want to use DrawString https://www.vintasoft.com/docs/vsimagin ... tring.html to write the new text.
The new text should have the same font as the deleted text. Is it possible to find the text font before deleting? How can I accomplish this?
Alex
Site Admin
Posts: 2305
Joined: Thu Jul 10, 2008 2:21 pm

Re: Get Text font

Post by Alex »

Hi David,
David_karlsson wrote: Tue Jan 15, 2019 2:58 pm Is it possible to find the text font before deleting?
Yes, you can get the text font before deleting the text.

To get the PDF font before deleting the text, you need to do the following steps:

Best regards, Alexander
David_karlsson
Posts: 8
Joined: Mon Jan 14, 2019 12:55 pm

Re: Get Text font

Post by David_karlsson »

Hi Alex.
I followed your instructions and it worked well on regular PDF files.
But when I used an image PDF with overlay text, it didn't work well.


Test pdf:
https://drive.google.com/open?id=1VRc06 ... xiG2NSrvw6

Result PDF:
https://drive.google.com/open?id=1dud5A ... Naysy6kXiE

The text is hidden behind a black rectangle. See the result PDF.

Code I used:

        public static void TestFindAndRemoveTextOnAllPages(string inputPdfFilename, string outputPdfFilename, params string[] textToRemove)
        {
            // open document
            using (Vintasoft.Imaging.Pdf.PdfDocument document = new Vintasoft.Imaging.Pdf.PdfDocument(inputPdfFilename))
            {
                // if there is a text to remove
                if (textToRemove.Length > 0)
                {
                    // create a list that contains text regions to remove
                    System.Collections.Generic.List<Vintasoft.Imaging.Pdf.Content.TextExtraction.PdfTextRegion> textRegions =
                        new System.Collections.Generic.List<Vintasoft.Imaging.Pdf.Content.TextExtraction.PdfTextRegion>();

                    // for each page
                    foreach (Vintasoft.Imaging.Pdf.Tree.PdfPage page in document.Pages)
                    {
                        // clear a list of text regions to remove
                        textRegions.Clear();

                        // for all text strings that must be removed
                        for (int i = 0; i < textToRemove.Length; i++)
                        {
                            // search text string on PDF page
                            Vintasoft.Imaging.Pdf.Content.TextExtraction.PdfTextRegion[] searchedText = SimpleTextSearchOnPdfPage(page, textToRemove[i]);
                            // if text is found
                            if (searchedText != null && searchedText.Length > 0)
                                // add searched text to a list of text for removing
                                textRegions.AddRange(searchedText);
                        }

                        // if PDF page contains text regions with text to remove
                        if (textRegions.Count > 0)
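                        {
                            // remove text regions from PDF page
                            page.RemoveText(textRegions.ToArray());

                            // draw the replacement text with the font of the removed text
                            // (First() requires "using System.Linq;"; PdfGraphics and PdfBrush
                            // are assumed to be imported with a using directive)
                            using (PdfGraphics graphics = page.GetGraphics())
                            {
                                foreach (Vintasoft.Imaging.Pdf.Content.TextExtraction.PdfTextRegion textRegion in textRegions)
                                {
                                    // take font, font size and color from the first symbol of the removed region
                                    var symbol = textRegion.Symbols.First();
                                    var font = symbol.TextSymbol.Font;
                                    var fontSize = symbol.FontSize;
                                    var fontColor = symbol.Color;
                                    var brush = new PdfBrush(fontColor);
                                    // draw new text at the location of the removed text
                                    graphics.DrawString("New text", font, fontSize, brush, textRegion.Rectangle.Location);
                                }
                            }
                        }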

                    }
                }

                // if names of source and destination files are the same
                if (inputPdfFilename == outputPdfFilename)
                    // pack PDF document
                    document.Pack();
                // if names of source and destination files are different
                else
                    // pack source PDF document to specified file
                    document.Pack(outputPdfFilename);
            }
        }


        /// <summary>
        /// Searches a text string on PDF page.
        /// </summary>
        /// <param name="page">PDF page where text should be searched.</param>
        /// <param name="text">Text to search.</param>
        /// <returns>An array of text regions on PDF page where text was found.</returns>
        public static Vintasoft.Imaging.Pdf.Content.TextExtraction.PdfTextRegion[] SimpleTextSearchOnPdfPage(
            Vintasoft.Imaging.Pdf.Tree.PdfPage page, string text)
        {
            System.Collections.Generic.List<Vintasoft.Imaging.Pdf.Content.TextExtraction.PdfTextRegion> textRegions =
                new System.Collections.Generic.List<Vintasoft.Imaging.Pdf.Content.TextExtraction.PdfTextRegion>();

            Vintasoft.Imaging.Pdf.Content.TextExtraction.PdfTextRegion textRegion = null;
            int startIndex = 0;
            do
            {
                // search text
                textRegion = page.TextRegion.FindText(text, ref startIndex, false);
                // if text is found
                if (textRegion != null)
                {
                    // add searched text to a result
                    textRegions.Add(textRegion);
                    // shift start index
                    startIndex += textRegion.TextContent.Length;
                }
            } while (textRegion != null);

            return textRegions.ToArray();
        }



Alex
Site Admin
Posts: 2305
Joined: Thu Jul 10, 2008 2:21 pm

Re: Get Text font

Post by Alex »

Hi David,
David_karlsson wrote: Tue Jan 15, 2019 6:46 pm I followed your instructions and it worked well on regular PDF files.
But when I use image PDF with overlay text it didn't work well.
...
The text is hiding behind a black rectangle.. See result.
Thank you for the information. To understand your problem and provide the best solution, we need to reproduce it on our side. Please send us (to support@vintasoft.com) a working console application that converts the "test.pdf" file into the "result.pdf" file.

Best regards, Alexander