Feature 3 – Batch File and Data

To run geocoding or reverse geocoding processes, users should know what kind of data can be used with CSV2GEO. First of all, in the batch geocoding and reverse geocoding fields, users can enter values by typing them manually or by copying and pasting addresses or geographic coordinates. However, when a large number of values must be entered, having all the data in a single file saves time. Moreover, you may have files produced by GIS software, and in that case a few clicks are enough to get the coordinates you need.

The file format that the CSV2GEO tool accepts is CSV. CSV stands for comma-separated values and, as the name suggests, the values in this kind of file are separated by commas. A CSV file stores tabular data, both numbers and text, as plain text. Each line of the file is a record, and each record consists of fields separated by commas. Besides commas, many other delimiters can separate the values in a CSV file, including semicolons, tabs and spaces. The good news is that the CSV2GEO tool supports all of these CSV variants, so users do not need to worry about which delimiter their CSV files use when geocoding.
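If you are curious which delimiter one of your own files uses, a minimal sketch with Python's standard csv module is shown here. The function name detect_delimiter and the sample data are hypothetical, and this is not how CSV2GEO itself detects delimiters; it only illustrates the idea of the same records being stored with different separators.

```python
import csv


def detect_delimiter(sample: str, candidates: str = ",;\t|") -> str:
    """Guess which of the candidate characters separates the fields.

    csv.Sniffer is a heuristic: it looks for a character that appears
    the same number of times on every line of the sample.
    """
    dialect = csv.Sniffer().sniff(sample, delimiters=candidates)
    return dialect.delimiter


# Hypothetical sample data: the same records with two different delimiters.
comma_sample = (
    "address,city,country\n"
    "1 Main St,Springfield,US\n"
    "2 Oak Ave,Portland,US\n"
)
semicolon_sample = comma_sample.replace(",", ";")

print(repr(detect_delimiter(comma_sample)))
print(repr(detect_delimiter(semicolon_sample)))
```

Because the detection is heuristic, it can fail on very short or irregular samples, so treat the result as a hint rather than a guarantee.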

A CSV file imported into csv2geo.com must be encoded as UTF-8, not in an ASCII-based legacy encoding. To understand the difference, we will demonstrate how a non-UTF-8 CSV file leads to wrong geocoding output.

To start with, if your file contains only ASCII letters, there is no need to worry about converting it to UTF-8, because plain ASCII text is already valid UTF-8.

Below is an example of an ASCII CSV file after loading.

[Image: ASCII symbols displayed inside the grid]

As you can see, ASCII cannot represent all non-English characters correctly, and they are replaced with seemingly random characters.

If a user processes this file, the result may be wrong at the city level, because the city names are incorrectly encoded, and that could mean a difference of hundreds of miles.
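The effect can be reproduced in a few lines of Python. This is only an illustrative sketch: the city name is an arbitrary example, and cp1252 (the Windows "ANSI" code page for Western Europe) is assumed here as a typical legacy encoding.

```python
# An example city name containing a non-ASCII letter.
city = "München"

# The bytes as they would be stored in a UTF-8 CSV file.
utf8_bytes = city.encode("utf-8")

# Read back with the right encoding, the name survives unchanged.
correct = utf8_bytes.decode("utf-8")

# Read back with a legacy single-byte encoding (assumed: cp1252),
# the two bytes of "ü" turn into two unrelated characters.
garbled = utf8_bytes.decode("cp1252")  # 'MÃ¼nchen'

# Strict ASCII cannot hold the letter at all; with errors="replace"
# it is swapped for a question mark.
ascii_replaced = city.encode("ascii", errors="replace")  # b'M?nchen'

print(correct, garbled, ascii_replaced)
```

A geocoder that receives the garbled spelling has to match a city name that does not exist, which is how an encoding problem becomes a location error.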

To avoid such errors, always use properly encoded UTF-8 CSV files.

The example below shows how that file looks when loaded.

[Image: non-UTF-8 symbols displayed inside the grid]

There are a few ways to prepare CSV files that are UTF-8 compliant.

If the file is small, you can use Notepad.

  1. Open your CSV file.
  2. Click Save As. Make sure the .csv file extension is included in the file name.
  3. Select All Files as the file type.
  4. Go to Encoding and select UTF-8.
  5. Click Save.

If you are using Microsoft Excel:

  1. Open your CSV file.
  2. Click Save As and enter a file name.
  3. In the file type drop-down, select CSV (Comma delimited).
  4. Click Tools, then Web Options.
  5. Open the Encoding tab and select UTF-8.
  6. Click Save.

[Image: Set UTF-8 file type]
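For files that are too large to handle comfortably in Notepad or Excel, the same conversion can be scripted. The sketch below is one possible approach, not a CSV2GEO feature. The function names are hypothetical, and the source encoding (cp1252 here) is an assumption that you must replace with the encoding your file was actually saved in.

```python
import tempfile
from pathlib import Path


def is_valid_utf8(path) -> bool:
    """Return True if the whole file can be decoded as UTF-8."""
    try:
        Path(path).read_bytes().decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def convert_to_utf8(src, dst, source_encoding: str = "cp1252") -> None:
    """Re-encode a text file to UTF-8.

    source_encoding is an assumption: it must match how src was saved.
    newline="" keeps the original line endings untouched.
    """
    with open(src, "r", encoding=source_encoding, newline="") as fin, \
         open(dst, "w", encoding="utf-8", newline="") as fout:
        for line in fin:
            fout.write(line)


# Demonstration on a throwaway file saved in the assumed legacy encoding.
with tempfile.TemporaryDirectory() as tmp:
    legacy = Path(tmp) / "legacy.csv"
    fixed = Path(tmp) / "utf8.csv"
    legacy.write_bytes("address,city\nMarienplatz 1,München\n".encode("cp1252"))

    before = is_valid_utf8(legacy)   # the legacy file is not valid UTF-8
    convert_to_utf8(legacy, fixed)
    after = is_valid_utf8(fixed)     # the converted file is
    converted_text = fixed.read_bytes().decode("utf-8")

print(before, after)
```

If the assumed source encoding is wrong, the script still produces a valid UTF-8 file, but with the wrong characters in it, so it is worth opening the result and checking a few non-English names.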

Another important point is that CSV2GEO accepts CSV files both with and without quotation marks (“”), since in some files the values are enclosed in quotation marks. Furthermore, users of the CSV2GEO tool should know that uploaded files are limited to 100 MB in size. However, one of the advantages of CSV files is that they take up less space than many other file formats, so a large number of records is needed to reach the 100 MB limit. Other advantages of CSV files are that they can be processed with a wide variety of applications (Microsoft Excel, Notepad, etc.), they are easy to handle and edit, and they are easy to generate.
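To see why quotation marks matter and how a file could be checked against the size limit before uploading, here is a small sketch using Python's standard library. The helper names are hypothetical, and reading 100 MB as 100,000,000 bytes is an assumption, since the exact byte count CSV2GEO applies is not stated here.

```python
import csv
import io
import os
import tempfile

# Assumption: 100 MB is read as 100 * 1000 * 1000 bytes.
MAX_UPLOAD_BYTES = 100 * 1000 * 1000


def within_size_limit(path, limit: int = MAX_UPLOAD_BYTES) -> bool:
    """Return True if the file on disk is no larger than the limit."""
    return os.path.getsize(path) <= limit


def parse_csv(text: str) -> list:
    """Parse CSV text into a list of rows using the default comma dialect."""
    return list(csv.reader(io.StringIO(text, newline="")))


# Quoted values may contain the delimiter itself, such as an address
# with a comma in it; unquoted values may not.
quoted_rows = parse_csv('"address","city"\n"1 Main St, Apt 4","Springfield"\n')
plain_rows = parse_csv("address,city\n1 Main St,Springfield\n")

print(quoted_rows[1])
print(plain_rows[1])

# Size check on a small throwaway file.
with tempfile.TemporaryDirectory() as tmp:
    sample_path = os.path.join(tmp, "sample.csv")
    with open(sample_path, "w", encoding="utf-8", newline="") as f:
        f.write("address,city\n1 Main St,Springfield\n")
    small_enough = within_size_limit(sample_path)

print(small_enough)
```

In the quoted version the address keeps its comma and still counts as a single field, which is the main reason some tools export every value in quotation marks.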

Most users know that CSV files can be opened with Microsoft Excel. Like many other kinds of software and files, CSV files differ somewhat depending on the part of the world where they are used. For example, the delimiter for CSV files in North America is the comma (,), while in much of Europe it is the semicolon (;). This can be challenging, and CSV files may be displayed incorrectly when the delimiter does not match, but anyone can change this delimiter. It can be changed in two ways: directly in Microsoft Excel, or by changing the Regional Settings on your computer.

To change the delimiter in the Regional Settings on your computer, follow these steps: Control Panel > Clock, Language and Region > Change the date, time, or number format > click the Additional settings button, then change the List separator value.

[Images: Windows, change region; Windows region, change additional settings; reset delimiter]

In Microsoft Excel, on the other hand, this delimiter can be changed by following these steps: Office Button (left-hand corner) > Excel Options > Advanced > Editing Options > uncheck 'Use system separators' and define the separators as needed.

[Image: reset option, define system separators]
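As an alternative to changing system or Excel settings, an existing file can also be rewritten with a different delimiter. The sketch below shows one way to do it with Python's csv module. The function name change_delimiter and the sample data are hypothetical.

```python
import csv
import io


def change_delimiter(text: str, old: str = ";", new: str = ",") -> str:
    """Rewrite CSV text from one delimiter to another.

    Parsing and re-writing with the csv module (instead of a plain
    text replace) keeps values intact when they contain the new
    delimiter: such values are quoted automatically.
    """
    reader = csv.reader(io.StringIO(text, newline=""), delimiter=old)
    out = io.StringIO(newline="")
    writer = csv.writer(out, delimiter=new, lineterminator="\n")
    writer.writerows(reader)
    return out.getvalue()


# Hypothetical semicolon-delimited file with a decimal comma in one value.
european = "address;city;price\nHauptstr. 1;Berlin;1,5\n"
converted = change_delimiter(european)

print(converted)
```

In the converted text the value 1,5 is wrapped in quotation marks, so its decimal comma is not mistaken for a field separator.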