Abstract |
<p>The present invention relates to a method for recognizing spaces in a PDF file. The method comprises: step 1, traversing the PDF file and recording the distance between each pair of adjacent characters; step 2, determining the minimum value h of the recorded distances; step 3, subtracting h from each recorded distance to obtain the relative distance between each pair of adjacent characters; and step 4, determining, for each pair in turn, whether its relative distance is less than a preset space width: if so, the gap between the pair is not a space; otherwise, the gap contains a space. The present invention improves the accuracy of determining whether a space exists between adjacent characters.</p> |
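<p>The four steps above can be illustrated with a minimal Python sketch. It assumes the inter-character distances have already been extracted from the PDF file (step 1) and are supplied as a list of numbers. The names <code>detect_spaces</code>, <code>insert_spaces</code>, <code>gaps</code>, and <code>space_width</code> are hypothetical and do not come from the source, and the sample distances are invented for illustration.</p>

```python
from typing import List


def detect_spaces(gaps: List[float], space_width: float) -> List[bool]:
    """Return one flag per adjacent-character gap: True if it contains a space.

    gaps        -- recorded distances between adjacent characters (step 1,
                   assumed to be extracted elsewhere)
    space_width -- the preset space width used as the threshold (step 4)
    """
    if not gaps:
        return []
    h = min(gaps)  # step 2: minimum distance h
    # Step 3: relative distance = distance - h.
    # Step 4: a relative distance below the preset width is not a space;
    # otherwise the gap contains a space.
    return [(d - h) >= space_width for d in gaps]


def insert_spaces(chars: str, gaps: List[float], space_width: float) -> str:
    """Rebuild the text, inserting a space wherever a gap is flagged."""
    if len(gaps) != max(len(chars) - 1, 0):
        raise ValueError("expected exactly one gap per adjacent character pair")
    flags = detect_spaces(gaps, space_width)
    out = []
    for i, ch in enumerate(chars):
        out.append(ch)
        if i < len(flags) and flags[i]:
            out.append(" ")
    return "".join(out)


# Illustrative data only: seven characters and six gaps (arbitrary units).
example = insert_spaces("PDFfile", [0.5, 0.5, 3.0, 0.5, 0.5, 0.5], space_width=2.0)
print(example)
```

<p>Subtracting h before the comparison means the threshold is applied to the excess over the tightest spacing found in the file, not to the raw distance. In this sketch the same preset width can therefore be used for files whose normal character spacing differs.</p>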