In building the Spatialite geodatabase (see previous post), one of the fundamental things I learned was how to manage table creation and alteration in SQLite, which was quite different from my previous experience working with MS Access and ArcGIS.
In SQLite, the ALTER TABLE statement is limited to changing the name of a table or adding new columns. If you want to make any other changes, the process is: create a new, blank table that’s structured the way you want, and then write an INSERT statement to copy the data you want from the existing table into the new table. So if you have a table that has a bunch of columns you want to drop or change, create a new table that has your desired structure, then insert what you want into that new table. The same thing goes for primary keys or other constraints. If your existing table doesn’t have a specified key you can’t alter it by specifying one: create a new table that does, then copy your data over.
Let’s say we have this table (de_data) with basic population data for Delaware’s three counties:
USPS GEOID NAME POP10 HU10 ALAND_SQMI AWATER_SQMI DE 10001 Kent County 162310 65338 586.179 212.152 DE 10003 New Castle County 538479 217511 426.286 67.717 DE 10005 Sussex County 197145 123036 936.079 260.312
And let’s say that we want to change some of the column names, drop the field for housing units, and specify our data types and GEOID as our key. First, create the table:
CREATE TABLE de_pop (STATE TEXT, GEOID TEXT NOT NULL PRIMARY KEY,
COUNTY TEXT, POP10 INTEGER, ALAND REAL, AWATER REAL)
GEOID is the FIPS/ANSI code that uniquely identifies each state. Since these codes may have leading zeros (the codes for all states from AL through CT do), we designate it as text.
Second, insert the data we want from the existing table:
INSERT INTO de_pop (STATE, GEOID, COUNTY, POP10, ALAND, AWATER)
SELECT USPS, GEOID, NAME, POP10, ALAND_SQMI, AWATER_SQMI
FROM de_data
Order here matters – it’s going to insert columns from the original table into the new one in sequence: USPS into STATE, GEOID into GEOID, etc. Sometimes it’s possible to use an * as a shortcut to insert and copy everything, instead of listing every field, if both tables contain the same number of columns and they’re in the right order. But this is always a bit risky.
Lastly, if we don’t need that original table we could delete it:
DROP TABLE de_data
SELECT * FROM de_pop
STATE GEOID COUNTY POP10 ALAND AWATER DE 10001 Kent County 162310 586.179 212.152 DE 10003 New Castle County 538479 426.286 67.717 DE 10005 Sussex County 197145 936.079 260.312
The process is similar if we want to take a query or view and turn it into a permanent table. SQLite does support CREATE TABLE AS, followed by a select query, so you can create a table out of a query. However – this is usually NOT the best course of action. If you do this, you won’t be able to specify a primary key for the new table (you really never want a table that lacks a key). Furthermore, if you create a new, calculated field you won’t be able to specify a type for it. This is particularly a problem if you’re working in Spatialite and want to create a spatial view or table that you want to join to some features and map in QGIS. If no type is specified, QGIS won’t know how to handle that new field, and it will be unviewable.
So, we take the same approach as before. Let’s say we want to create a table with population density as a calculated field. First, create the table:
CREATE TABLE de_popdens (STATE TEXT, GEOID TEXT NOT NULL PRIMARY KEY,
COUNTY TEXT, POP10 INTEGER, ALAND REAL, AWATER REAL, POPDENS REAL)
Then we write our insert statement, and in the insert we create the calculated field:
INSERT INTO de_popdens (STATE, GEOID, COUNTY, POP10, ALAND, AWATER, POPDENS)
SELECT STATE, GEOID, COUNTY, POP10, ALAND, AWATER, (ROUND(POP10/ALAND),1)
FROM de_pop
SELECT * FROM de_density:
STATE GEOID COUNTY POP10 ALAND AWATER POPDENS DE 10001 Kent County 162310 586.179 212.152 276.9 DE 10003 New Castle County 538479 426.286 67.717 1263.2 DE 10005 Sussex County 197145 936.079 260.312 210.6
This gives us a new table with density properly typed. Once again, this is only necessary if you want the calculated field to be permanent and function with other objects in the database, and particularly if you want to join this table to a geodatabase feature or shapefile to map the attributes. If you simply want the new field to answer a specific question or to export the data to output, you can just do a SELECT query and save it as a view.
I also had to do this procedure for every table and shapefile that I imported to the geodatabase. In the Spatialite GUI you don’t have the option to specify a primary key or data types for columns when you do an import; for the latter it takes its best guess. So to get it well-structured, I imported (or created a virtual link), created a new table, inserted the data over, then deleted the original table or severed the link. If it was a shapefile I went through the extra step of activating the geometry (check out the Spatialite Cookbook or the tutorial I wrote for the NYC geodatabase for details).
Since I was dealing with some enormous tables with hundreds of columns, I used some trickery to avoid typing all the statements by hand. If the data was small enough and came from a spreadsheet, I used a series of concatenate formulas to build the CREATE TABLE and INSERT statements by copying the field names and stringing them together with type names and necessary syntax, so I could just copy and paste a statement into the SQL dialog box. For larger datasets I used Python to do the processing, and had Python grab field names and write statements that I could copy and paste.
The import issue was particular to the Spatialite GUI, and not SQLite in general. If you’re dealing with just data tables and use the SQLite Manager (Firefox plugin), it asks you to specify column names, keys and constraints, and types for the columns you’re importing. It does the latter by having you select from a dropdown box for each column – this works fine if you have 10 or 20 columns, but it’s rather tedious if you have hundreds. The Manager also gives you the ability to alter additional elements of the table, like column names, by essentially performing the same operations (create table, insert records, drop table) behind the scenes while holding things temporarily in memory. It prefaces this with a warning message that this operation is not part of the standard SQLite commands, and there’s a chance something could go awry.