Created
September 1, 2011 03:47
appengine custom transforms
Notes on exporting and importing data to appengine.

Schema refactoring or complex relationships between entities may
require custom transforms to be written and added to the config yaml
file that the bulkloader generates automatically.
A few helpful resources on that are
    http://wereword-gae.googlecode.com/hg/backup/helpers.py
    http://bulkloadersample.appspot.com/
    http://longsystemit.com/javablog/?p=23
and, from Google I/O 2011:
    http://bulkloadersample.appspot.com/showfile/bulkloader-presentation.pdf
and, for pointers on using the bulkloader default config.yaml:
    http://ikaisays.com/2010/06/10/using-the-bulkloader-with-java-app-engine/

(1) Collections of items that are stored in the entity (implicitly as embedded classes),
such as a list of strings, need custom transforms.

In the config yaml file:
    - property: parameterNames
      external_name: parameterNames
      import_transform: transform.split_string('|')
      export_transform: additionaltransformers.export_parameternames_to_string

And in the imported additionaltransformers.py:

    def export_parameternames_to_string(value, bulkload_state):
        # Read the list from the entity being exported rather than from `value`.
        parameters = bulkload_state.current_instance['parameterNames']
        if parameters is None:
            return ''
        # Join with the same '|' delimiter that the import transform splits on.
        return "|".join(["%s" % (k) for k in parameters])
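
As a quick sanity check of the pairing above, here is a minimal sketch that runs without the App Engine SDK. It assumes that transform.split_string('|') behaves like a plain str.split('|'). The split_string stand-in and the FakeBulkloadState class are hypothetical helpers written only for this illustration; they are not the bulkloader's own code.

```python
def split_string(delimiter):
    """Hypothetical stand-in for transform.split_string (assumed to split on the delimiter)."""
    def split(value):
        if not value:
            return []
        return value.split(delimiter)
    return split


class FakeBulkloadState(object):
    """Hypothetical stub exposing only the attribute the export transform reads."""
    def __init__(self, current_instance):
        self.current_instance = current_instance


def export_parameternames_to_string(value, bulkload_state):
    parameters = bulkload_state.current_instance['parameterNames']
    if parameters is None:
        return ''
    return "|".join(["%s" % (k) for k in parameters])


# Export a list to one column value, then split it back as the import would.
names = ['alpha', 'beta', 'gamma']
state = FakeBulkloadState({'parameterNames': names})
exported = export_parameternames_to_string(None, state)
reimported = split_string('|')(exported)
print(exported)
print(reimported == names)
```

A parameter name that itself contains '|' would not survive this round trip, so the delimiter should be one that cannot occur in the data.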

(2) A change of ancestor keys, to refactor membership in entity groups or
for migration, may need the key components separated on export and
reconstructed differently on import.
Note that the changes should be consistent with appengine's Bigtable
rules for entity groups and ancestors.

In this case, I've created a new column in the export to be used during a
subsequent import. The new column is parentKey, and it is generated here.

In the config yaml file:
    - kind: Event
      connector: csv
      connector_options:
        encoding: utf-8
        columns: from_header
        import_options:
          dialect: excel-tab
        export_options:
          dialect: excel-tab
      property_map:
        - property: __key__
          external_name: key
          export:
            - external_name: parentKey
              export_transform: additionaltransformers.create_pseduo_parent_event_key_string
            - external_name: key
              export_transform: transform.key_id_or_name_as_string_n(0)
          import_transform: additionaltransformers.create_event_key

And in the imported additionaltransformers.py:

    from google.appengine.api import datastore

    def create_event_key(value, bulkload_state):
        # Both columns come from the row currently being imported.
        pn = bulkload_state.current_dictionary['parentKey']
        kn = bulkload_state.current_dictionary['key']
        # Key path: the parent Event first, then this Event as its child.
        kpath = ['Event', pn] + ['Event', kn]
        return datastore.Key.from_path(*kpath)
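
To see which rows and key paths this configuration implies, here is a small sketch that needs only the standard library. The two column names come from the config above. The sample rows, the event_key_path helper, and the use of csv.DictReader in place of the bulkloader's csv connector are assumptions for illustration. The sketch stops at the flat path list that would be passed to datastore.Key.from_path.

```python
import csv
import io

# Made-up export in the excel-tab dialect with a header row
# (matching "columns: from_header" in the config).
SAMPLE_TSV = "parentKey\tkey\nconference-2011\tkeynote\nconference-2011\tworkshop\n"


def event_key_path(row):
    """Return the flat [kind, name, kind, name] path for one row.

    Hypothetical helper: it mirrors the kpath list in create_event_key
    but does not call the datastore API.
    """
    return ['Event', row['parentKey'], 'Event', row['key']]


reader = csv.DictReader(io.StringIO(SAMPLE_TSV), dialect='excel-tab')
paths = [event_key_path(row) for row in reader]

for path in paths:
    print(path)

# Rows sharing the first (kind, name) pair share a root, and so an entity group.
roots = set(tuple(path[:2]) for path in paths)
print(len(roots))
```

Values read from the file arrive as strings, which Key.from_path treats as key names; entities that were originally keyed by numeric id may need an int conversion first.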

(3) Exploring the creation of DAGs? Because we can create keys with common
ancestors, and hence co-membership in an entity group, and because we can create
the one-to-many IDX fields in the "many" entity, we should be able to create
entities that exist in more than one collection of another entity in our tsv files
and then in the datastore via upload.

The DAG relationships are already possible as unowned relationships in appengine,
that is, as collections of keys.
But if it's possible to use the convenient auto-fetching within a transaction
to have the full entities available from a fetch (and to force fetches via touches
within the transaction to get as much of the DAG as needed), this might be useful.

The caveat is that even if the import of the DAG with "owned" relationships
succeeds, there may be trouble during transactional updates. I haven't
tried this.
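
The unowned form mentioned above, collections of keys, can be sketched without the datastore. The column names (key, memberKeys), the sample rows, and the '|' delimiter below are all hypothetical. The sketch only shows that one entity can appear in more than one collection when relationships are stored as key lists in a tsv file; it says nothing about the owned case.

```python
import csv
import io
from collections import defaultdict

# Made-up tsv: each Event row lists the keys of its members, '|'-delimited.
SAMPLE_TSV = (
    "key\tmemberKeys\n"
    "keynote\talice|bob\n"
    "workshop\tbob|carol\n"
)

reader = csv.DictReader(io.StringIO(SAMPLE_TSV), dialect='excel-tab')

# Collection owner -> member keys (the unowned relationship).
members_of = {}
for row in reader:
    members_of[row['key']] = row['memberKeys'].split('|')

# Invert the mapping to find members that belong to several collections.
collections_of = defaultdict(list)
for owner, members in members_of.items():
    for member in members:
        collections_of[member].append(owner)

shared = sorted(m for m, owners in collections_of.items() if len(owners) > 1)
print(shared)
```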