Difference between revisions of "Data Migration"
Line 136: | Line 136: | ||
:: Bananas Ltd\ | :: Bananas Ltd\ | ||
::: org_mandate.docx | ::: org_mandate.docx | ||
− | An appropriate mapping file | + | An appropriate mapping file is shown below: |
+ | |||
+ | The unique identifier column being a reference to the unique identifier of each record and also include the field id of where the files will be uploaded into if it will be uploaded into different fields. | ||
{| class="wikitable" style="height: 134px;" width="534" | {| class="wikitable" style="height: 134px;" width="534" | ||
Line 203: | Line 205: | ||
===[[Custom_Field_Type:_Special_-_Advanced_Data_Table|Advanced Data Table]]=== | ===[[Custom_Field_Type:_Special_-_Advanced_Data_Table|Advanced Data Table]]=== | ||
<!--Ticket# 91363 - Imported L2 activities that are mapped to L1 XML fields are not displayed until the XML field is manually saved--> | <!--Ticket# 91363 - Imported L2 activities that are mapped to L1 XML fields are not displayed until the XML field is manually saved--> | ||
− | |||
Limitations when importing L2 records that will be displayed in L1 data table: | Limitations when importing L2 records that will be displayed in L1 data table: | ||
* The imported Advanced Data Table records are not displayed until the table is manually saved one time | * The imported Advanced Data Table records are not displayed until the table is manually saved one time | ||
* The syntax to display the Advanced Data Table records must be hardcoded and revised if the dependent XML field is ever changed versus using the @XML reference that will automatically include any changes | * The syntax to display the Advanced Data Table records must be hardcoded and revised if the dependent XML field is ever changed versus using the @XML reference that will automatically include any changes | ||
− | |||
[[Category:Data Import]] | [[Category:Data Import]] |
Latest revision as of 11:19, 5 October 2022
Contents
Overview
The following article outlines the procedure for migrating historical information from an external source into SmartSimple during the initial implementation of your system. Data migration involves the extraction, transformation, and then loading of data from one system into another.
For a successful migration, you will need the appropriate resources in order to first decide and plan which data you need migrated, and then to extract and transform this data from your existing system.
Note: The migration of your historical information is not necessary for a successful implementation of SmartSimple, and is not recommended for clients without the appropriate resources. There are exceptions to this however, where SmartSimple will assist or take on the full responsibility of the data migration for you. These exceptions are listed below.
Exceptions
MicroEdge GIFTS Clients
Clients transitioning from a MicroEdge GIFTS offline system may provide the full GIFTS database without performing any of the migration tasks listed below. SmartSimple has a standard procedure for directly transforming a GIFTS database into an appropriate format for import.
Simple Datasets
For clients with a sufficiently small or simple set of data, SmartSimple will assist you with the migration tasks listed below or even carry out the full data migration for you.
Migration Process
Prerequisites
- You must have resources with the authority, and expertise capable of determining which historical data need to be migrated.
- You must have resources with the ability to review, and ensure the data integrity of historical data.
- You must have resources with the ability to extract, and transform historical data into acceptable formats for import into SmartSimple.
Client Responsibility
Review of Current System
You will first need to determine which information you wish to migrate into SmartSimple. While it may be tempting to transfer everything, it is recommended that you review your data carefully, and only transfer what is essential. The exercise of reviewing your current data will better inform your requirement specifications while designing your new SmartSimple system. This translates to a better implementation of, and successful transition to, SmartSimple.
Review of Data Integrity
You are responsible for ensuring the data integrity, and quality of data that is imported into SmartSimple. It is recommended that the remediation of any issues be performed directly on your previous system prior to data export. This will help to improve the repeatability of the data extraction process. For a successful migration, you will need to ensure the following.
Required:
- Entity Integrity - every record set must possess a unique identifier for each record. This can be comprised of multiple columns. For example, for contacts, emails must be unique and contain no duplicates.
- Referential Integrity - every related record set must be linked by their respective unique identifiers. For example, a contact associated to an organization should be indicated by the contact record containing the unique identifier for the organization.
- Standard Data Type Formatting - standard data types should be formatted appropriately. For example, date values should be formatted to 'YYYY-MM-DD', and date/time values to 'YYYY-MM-DD hh:mm:ss'. Number values should be formatted to decimals '######.####' without currency, comma, or other formatting symbols.
Recommended:
- Other Field Formatting - if you wish to validate and enforce formatting conventions beyond the strict data types above, you will need to ensure that your data is similarly formatted. For example, if you wish to format phone numbers to '(###) ###-####', or zip codes to '#####-####' your data must be similarly formatted first.
- Picklist Integrity - for any fields that should be restricted to a pre-defined picklist of values, you should ensure the integrity of these field values. For example, address states should all be the full name, e.g. 'New York', or all be the abbreviation, e.g. 'NY', but not a mix of both, and not contain spelling errors or typos.
- Duplicate Checking - while you may satisfy that every record possesses a unique identifier, your data may still contain duplicate records. These should be resolved with duplicates being merged or deleted. For example, consider records that may have misspelled information e.g. organizations named 'Oranges Inc.' and 'Oranges Incorporated'.
Data Extraction
Data may be provided to SmartSimple in any of the following formats:
- Plain Text Files (CSV, XML, JSON formatted)
- Microsoft Excel
- Microsoft Access Database
- Microsoft SQL Server Database (Database size < 10200 mb)
- MySQL Database
- Oracle Database
Recommended:
- Export using UTF-8 character encoding to avoid encoding issues with special characters and language support.
- Convert any NULL values instead to empty values to avoid importing 'NULL' as a text.
Constraints:
- If providing data in plain text format, please ensure strict formatting is adhered to with the escaping of delimiter characters. For example, a CSV should have all values encapsulated by double quotes and escape any quotes that exist within a value.
Example:
The below is an example of a correctly formatted 3-column CSV that accounts for line breaks and escapes embedded quotes.
"Organization Name","Organization ID","Description" "Oranges Inc","4321","Peel. Eat. Repeat." "Bananas Ltd","8756","You're going to go ""bananas"" over our product!"
Data Transformation
Data Schema
Required:
Data must be provided in a normalized form such that the information for each record type be represented in separate tables or files. Typical record types are defined by the following tables or files, but are not limited to:
- Organizations
- Organization Addresses
- Organization Associations to Grants
- Contacts
- Contact Addresses
- Contact Associations to Grants
- Grants
- Grant Reviews
- Grant Progress Reports
- Grant Payments
- Other Grants Activities
- Funds
- Budgets
Example:
For example, a common guideline would be the below.
- All contacts must on one worksheet with a unique identifier for each contact and a matching identifier for their organization (for those associated with organizations)
- All organizations must be on one worksheet with a unique identifier for each organization.
- All grants must be on one worksheet with a unique identifier for each grant and a matching identifier for the applicant contact and organization for the grant.
- If there are multiple contacts associated to a grant (e.g. applicants, reviewers), this must be provided on a separate worksheet with the grant unique identifier, the contact unique identifier, and the role of the contact association.
- If there are multiple organizations associated to a grant (e.g. payees), this must be provided on a separate worksheet with the grant unique identifier, the organization unique identifier, and the role of the organization association.
- All payments must be on one worksheet with a unique identifier for each payment and a matching identifier for the grant against which the payment is made.
- All progress reports for import must be on one worksheet with a unique identifier for each progress report and a matching identifier for the grant against which the progress report is made.
Constraints:
- Maximum number of columns in any given table or file should be limited to under 100. If you require more than 100 columns, then this should be represented by multiple tables or files that each contain the unique identifier that identifies the record.
File Attachments
In order to migrate file attachments into SmartSimple you will need to provide two items.
Required:
- A zipped file of all files attachments to be uploaded.
- A mapping file to match files to records. This will simply be CSV sheets that describe which record each file should be uploaded to. The CSV should contain columns for the unique identifier of the record to upload to, the record type (e.g. organization, contact), and the relative file path for each file.
Example:
For example, given a zip file whose contents look like the below.
fileattachments.zip
- grants\
- grant12345\
- grant_budget.docx
- grant_request.docx
- grant45678\
- grant_budget.docx
- grant_request.docx
- grant12345\
- organizations\
- Oranges Inc\
- org_mandate.docx
- Bananas Ltd\
- org_mandate.docx
- Oranges Inc\
An appropriate mapping file is shown below:
The unique identifier column being a reference to the unique identifier of each record and also include the field id of where the files will be uploaded into if it will be uploaded into different fields.
Unique Identifier | Object | Relative path + file name | Field ID |
12345 | Grant | \grants\grant12345\grant_budget.docx | 2023123 |
12345 | Grant | \grants\grant12345\grant_request.docx | 2023123 |
45678 | Grant | \grants\grant12345\grant_budget.docx | 2023123 |
45678 | Grant | \grants\grant12345\grant_request.docx | 2023123 |
4321 | Organization | \organizations\Oranges Inc\org_mandate.docx | 1547893 |
8756 | Organization | \organizations\Bananas Ltd\org_mandate.docx | 1547893 |
Constraints:
- File size of each individual file <= 1GB
- Length of full path description < 255 characters
- File names must consist of characters within the UTF-8 character set. It is recommended to avoid all special characters within file names to avoid issues across different platforms as reserved characters differ between Windows, *nix,...etc. platforms.
- There is no file versioning when importing file attachments into SmartSimple. Duplicate files that are to be attached to the same record will overwrite each other. For example, in the above example, if the two lines with respect to grant 45678 were instead attributed to grant 12345, then the result would only be a single grant_budget.docx file and a single grant_request.docx file instead of four files total.
SmartSimple Responsibility
Data Transform
As the project team designs your SmartSimple system and architecture, we will work closely with you to design the final data translation to migrate your data into SmartSimple. SmartSimple will create a transform mapping and review this with you.
Data Load
During the course of a project implementation there will be two data loads conducted. One at the beginning of the project to allow you to verify the information that is input into SmartSimple. A summary report will be provided to you to review.
The second data load will be at the end of the project for the final migration from your current system to SmartSimple. For this final data load, you can expect between a 3-5 day maintenance window where we ask you to freeze any updates to your current system in order to cut over and transition to SmartSimple.
Notes
System Generated Fields
Standard Fields such as Created Date, Created By, Modified Date, Modified By are system generated fields. To preserve the integrity of data, these fields can not be modified thru data import. If the client wants to import other “representative” dates from some other system then these could be tracked in custom fields.
Advanced Data Table
Limitations when importing L2 records that will be displayed in L1 data table:
- The imported Advanced Data Table records are not displayed until the table is manually saved one time
- The syntax to display the Advanced Data Table records must be hardcoded and revised if the dependent XML field is ever changed versus using the @XML reference that will automatically include any changes