Document Layout Analysis




Document Layout Analysis is our second exercise. Using the three sample images, our program needs to draw bounding boxes around the following:
  1. Individual characters
  2. Individual words
  3. Lines
  4. Paragraphs
  5. Paragraphs with margins

I used a bottom-up approach for this exercise: I started by detecting and boxing letters, then moved up to words, lines, paragraphs, and finally paragraphs with margins. I wrote one function per objective and found an appropriate kernel size for each function by trial and error. Every objective follows the same simple steps:

  1. Load the images.
  2. Create the output images.
  3. Convert the images to grayscale.
  4. Clean the images with Otsu's thresholding method, using the inverted binarized image.
  5. Assign the kernel size (one or two kernels, depending on the objective).
  6. Apply morphological operations (dilation, erosion, closing, and opening).
  7. Find the contours.
  8. Box the contours. (I added an offset for the word and letter objectives because the morphological operations I used shift the region that the rectangle bounds.)
  9. Write the image to a file.

While doing the exercise, I doubted my algorithm because it seemed very simple and not very dynamic, so at first I only finished boxing letters, words, and lines for the first image (with very messy code). When I heard that my classmates had taken the same approach, I went on to code the objectives I still lacked.

In my implementation, large fonts and colored images are the main limitations. For example, the word function boxes each letter of a heading as a separate word because of the letters' size. (Remember that each function uses a hardcoded kernel size.) I can only guarantee high detection accuracy on the example images.
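
The font-size limitation can be illustrated with a tiny one-dimensional model. This is a simplified sketch in plain Python, not the actual image code: `group_runs` is a hypothetical stand-in for a horizontal dilation followed by contour boxing, `max_gap` plays the role of the hardcoded kernel width, and the letter and gap widths are made-up numbers.

```python
def render_line(letter_width, letter_gap, word_gap, words=(3, 3)):
    """Build a 1-D ink profile (1 = ink, 0 = background) for one text line."""
    profile = []
    for i, n_letters in enumerate(words):
        if i:
            profile += [0] * word_gap          # space between words
        for j in range(n_letters):
            if j:
                profile += [0] * letter_gap    # space between letters
            profile += [1] * letter_width
    return profile


def group_runs(profile, max_gap):
    """Merge ink columns into (start, end) groups, bridging gaps <= max_gap."""
    groups = []
    for x, ink in enumerate(profile):
        if not ink:
            continue
        if groups and x - groups[-1][1] - 1 <= max_gap:
            groups[-1][1] = x                  # extend the current group
        else:
            groups.append([x, x])              # start a new group
    return [tuple(g) for g in groups]


body = render_line(letter_width=6, letter_gap=2, word_gap=8)
heading = render_line(letter_width=18, letter_gap=6, word_gap=24)  # 3x larger font

MAX_GAP = 4  # tuned by trial and error for the body text only
print(len(group_runs(body, MAX_GAP)))     # groups found in the body line
print(len(group_runs(heading, MAX_GAP)))  # groups found in the heading line
```

With the gap tuned for body text, the body line collapses into its two words, while the heading's letter gaps are wider than the setting, so every heading letter ends up in its own "word" box. Scaling the kernel with an estimate of the text height would be one way around this, but I have not tried it.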

Here are the resulting images for example 1:

Letters

Words

Lines

Paragraphs

Paragraphs with Margin
