Configuration JSON data¶
The configuration governs not only where to find data but also the structure of the output, which mirrors the structure of the configuration JSON.
The two main components of the configuration JSON are objects and attributes. An object can contain nested objects and/or attributes. The attributes are where you actually tell the mapper where to find data. The objects decide the structure and also tell the mapper whether any data in the input needs to be iterated over to create multiple instances.
Object¶
An object has a name and can have attributes, nested objects, or a special type of object called a branching object. It also knows whether it is an array, and the path to where the input data can be iterated to create multiple objects.
name | type | description | comment |
---|---|---|---|
name | str | name of the key it will get in parent object | the root will not get a name |
array | bool | tells the mapper if this should be an array or not | |
iterables | array[iterable] | Lets you iterate over lists in input data and apply configuration to every iteration of the lists | |
attributes | array[attribute] | An array of this object's attribute mappings | |
objects | array[object] | Here you can nest more objects. | |
branching_objects | array[branching object] | Array of a special kind of object | rarely used |
{
"name": "object_name",
"array": true,
"iterables": [],
"objects": [],
"branching_objects": [],
"attributes": []
}
Iterable¶
Iterables are your bread and butter for looping over lists in the input data. For each and every element in the list, we apply the mapping configuration of the current object and its children.
name | type | description |
---|---|---|
alias | str | The aliased name the current iteration will get in the data which mapping is applied to |
path | array[str|int] | path to the iterable list/array in the input data |
{
"alias": "keyname",
"path": ["path", "to", "list"]
}
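As a rough illustration of the idea, the sketch below shows one way an iterable could be applied: each element of the list found at `path` is placed into the data under `alias`, so that mappings can then use the alias as the first step of their own path. The function name `iterate` and the sample data are hypothetical, and this is an assumption about the behaviour based on the table above, not the mapper's actual implementation.

```python
from typing import Any, Dict, Iterator


def iterate(data: Dict[str, Any], iterable: Dict[str, Any]) -> Iterator[Dict[str, Any]]:
    """Yield one copy of `data` per list element, with the element stored under the alias."""
    items: Any = data
    for step in iterable["path"]:
        # Raises KeyError/IndexError if the configured path does not exist.
        items = items[step]
    for item in items:
        # The original data stays available; the current element is added under the alias.
        yield {**data, iterable["alias"]: item}


config = {"alias": "keyname", "path": ["path", "to", "list"]}
source = {"path": {"to": {"list": [{"id": 1}, {"id": 2}]}}}

for scoped in iterate(source, config):
    # A mapping with path ["keyname", "id"] would now find the current element's id.
    print(scoped["keyname"]["id"])
```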
Explanation of Iterables coming
Attribute¶
Attributes are things like the 'color' of a car or the 'amount' in an invoice. An attribute has a name ('amount'), a number of mappings, a separator, if statements, casting, and a default value if all else fails.
name | type | description | default |
---|---|---|---|
name | str | The name it will get in the parent object | |
mappings | array[mapping] | list of mapping objects which is where to find data | [] |
separator | str | string to separate each value in case multiple are found in mapping step | '' |
if_statements | array[if statement] | If statements that can change data based on conditions | [] |
casting | casting | Lets you cast data to a specific type [integer, decimal, date] | {} |
default | Any | If after all mapping, if statements and casting the result is None this value is used | None |
{
"name": "attribute_name",
"mappings": [],
"separator": "",
"if_statements": [],
"casting": {},
"default": "default value"
}
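To make the roles of `separator` and `default` concrete, here is a minimal sketch of how the values found by several mappings might be combined. The helper `combine_mapped_values` is hypothetical, and returning a single value unchanged is an assumption; if statements and casting are left out of this sketch.

```python
from typing import Any, List


def combine_mapped_values(values: List[Any], separator: str = "", default: Any = None) -> Any:
    """Join the values that were found; fall back to `default` when none were."""
    found = [value for value in values if value is not None]
    if not found:
        return default
    if len(found) == 1:
        # Assumption: a single value keeps its original type.
        return found[0]
    return separator.join(str(value) for value in found)


print(combine_mapped_values(["John", None, "Doe"], separator=" "))
print(combine_mapped_values([None, None], default="default value"))
```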
Mapping¶
This is the only place where actual interaction with the input data is done.
name | type | description | default |
---|---|---|---|
path | array[str|int] | path to data you want to retrieve. | [] |
if_statements | array[if statement] | If statements that can change data based on conditions | [] |
default | Any | If no value is found or value is None after if_statements then this value is used | None |
Note
Either path or default must contain something.
Explanation of path¶
You add a list of strings or integers that will get you to your data. For example, if you need the second element of the list called my_list in the following JSON, then your path will be ["my_list", 1] and you will get the value index1.
{
"my_list": ["index0", "index1"]
}
- if_statements: list of if statements that can change the data depending on conditions
- default: a default value used if no value is found or the value found is None
{
"path": ["path", "to", "data"],
"if_statements": [],
"default": "default"
}
input({'path': { 'to': { 'data': 'value'}}}) -> 'value'
input({'path': {'does_not_exist': 'value'}}) -> 'default'
input() -> 'default'
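The lookup described in this section can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `fetch_by_path` is a hypothetical helper, not the mapper's own function, and if statements are ignored here.

```python
from typing import Any, Sequence, Union

PathStep = Union[str, int]


def fetch_by_path(data: Any, path: Sequence[PathStep], default: Any = None) -> Any:
    """Walk `path` through nested dicts and lists; fall back to `default`."""
    if not path:
        # Assumption: with an empty path only the default can supply a value.
        return default
    current = data
    for step in path:
        try:
            current = current[step]
        except (KeyError, IndexError, TypeError):
            # Missing key, index out of range, or a value that cannot be indexed.
            return default
    return default if current is None else current


print(fetch_by_path({"my_list": ["index0", "index1"]}, ["my_list", 1]))
print(fetch_by_path({"path": {"to": {"data": "value"}}}, ["path", "to", "data"], "default"))
print(fetch_by_path({"path": {"does_not_exist": "value"}}, ["path", "to", "data"], "default"))
```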
Regexp¶
Lets you match values against the pattern given in search, and specify in group which matches to return and in what order.
Note
The default group value is 0, which returns the first match. To return all matches, set group to []. You can also specify group as [1, 3, 2], which returns the 2nd, 4th and 3rd matches in exactly that order. To return the 5th element from the end, you would need to get all matches with [] and then slice the resulting array; see the "Slicing over arrays" issue.
name | type | description | default |
---|---|---|---|
search | string | Pattern for string matching | |
group | integer|array | Index or indices of the matching substrings to return | 0 |
{
"search": "(i\\w)",
"group": [0, 2, 1]
}
input('Vladimir Kramnik') -> ['im', 'ik', 'ir']
Note
Values are returned as a string when group is an integer, or as an array of strings when group is an array.
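The selection rules above can be approximated with Python's standard `re` module. The sketch below assumes the pattern has a single capturing group, so that `re.findall` returns plain strings, and that an index without a match is skipped or yields None. `select_matches` is a hypothetical name, not part of the mapper.

```python
import re
from typing import List, Optional, Union


def select_matches(
    value: str, search: str, group: Union[int, List[int]] = 0
) -> Union[Optional[str], List[str]]:
    """Return the matches of `search` in `value` that `group` asks for."""
    matches = re.findall(search, value)
    if isinstance(group, int):
        # Single index: one string, or None when there is no such match.
        return matches[group] if 0 <= group < len(matches) else None
    if not group:
        # Empty list: return every match in the order found.
        return matches
    # List of indices: return the matches in exactly the requested order.
    return [matches[i] for i in group if 0 <= i < len(matches)]


print(select_matches("Vladimir Kramnik", r"(i\w)", [0, 2, 1]))  # ['im', 'ik', 'ir']
print(select_matches("Vladimir Kramnik", r"(i\w)"))  # im
```

Getting the last few matches is then ordinary list slicing on the result of `group=[]`, for example `select_matches(value, search, [])[-5:]`.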
Slicing¶
Lets you slice a value from index from to index to. Slicing is implemented exactly like Python's string[from:to] slicing. This means that when from is negative you count back from the end, and if to is null or left out we consume the rest of the string.
name | type | description | default |
---|---|---|---|
from | int | Where to cut the value from, counted from 0 | |
to | int|null | The index to cut to; leave the key out or set the value to null to go to the end of the string | |
{
"from": 1,
"to": 3
}
input('hello') -> 'el'
Note
All values are turned into strings before slicing is applied. This lets you slice any value, independent of its original type in the input data. If the Slicing object in the config is empty, this string casting is also skipped. The result of slicing is always a string, so if you need a different type, use casting to change it.
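Because the behaviour is defined in terms of Python slicing, it is easy to sketch. The function below is a hypothetical illustration of the rules in this section (string casting, optional `to`, empty config skipped), not the mapper's own code.

```python
from typing import Any, Dict


def apply_slicing(value: Any, slicing: Dict[str, Any]) -> Any:
    """Slice str(value) from `from` to `to`; an empty config leaves the value untouched."""
    if not slicing:
        # Empty slicing object: no string casting, value is returned as-is.
        return value
    start = slicing["from"]
    end = slicing.get("to")  # None (null or missing) means "to the end of the string"
    return str(value)[start:end]


print(apply_slicing("hello", {"from": 1, "to": 3}))  # el
print(apply_slicing(12345, {"from": -2}))  # 45
print(apply_slicing(12345, {}))  # 12345
```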
If Statement¶
This is where you can change found (or not found) data to something else based on a condition. If statements are chained, in the sense that what the first one produces is the input to the next one. So if you want to keep the original value when a condition fails, leave out otherwise.
name | type | description | default |
---|---|---|---|
condition | one of ["is", "not", "in", "contains"] | What condition to use when checking value against target | |
target | str|number|bool|array|object | The target we check the condition against, i.e. value == target when condition is is | |
then | str|number|bool|array|object | value that we will return if the condition is true | |
otherwise | str|number|bool|array|object | Optional value that we can return if the condition is false | None |
{
"condition": "is",
"target": "1",
"then": "first_type",
"otherwise": "default_type"
}
input('2') -> 'default_type'
input('1') -> 'first_type'
If statements work a bit differently depending on what type of value you expect to get. This table shows the exact operators used for each value type.
condition | valuetype | code | comment |
---|---|---|---|
is | all | value == target | |
not | all | value != target | |
in | array|object | value in target | target must be an array or a string; value must then equal one of the array items, or be a substring when target is a string. The array items can themselves be arrays or objects, which allows checking against array and object values |
contains | object | target in value | For objects this means that target is a key in the object |
contains | array | target in value | For arrays this means that target equals one or more items in the array |
contains | rest | target in str(value) | For all other types we cast the value to a string so that the check can still be done |
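The operator table and the chaining rule can be put together in a short Python sketch. The names `check` and `apply_if_statements` are hypothetical, and keeping the current value when a condition fails and otherwise is left out follows the description above rather than the mapper's source.

```python
from typing import Any, Dict, List


def check(condition: str, value: Any, target: Any) -> bool:
    """Evaluate one condition using the operators from the table."""
    if condition == "is":
        return value == target
    if condition == "not":
        return value != target
    if condition == "in":
        return value in target
    if condition == "contains":
        if isinstance(value, (list, dict)):
            # dict: target is a key; list: target equals an item.
            return target in value
        # Other types are cast to a string first; target must be a string here.
        return target in str(value)
    raise ValueError(f"unknown condition: {condition}")


def apply_if_statements(value: Any, statements: List[Dict[str, Any]]) -> Any:
    """Feed the result of each statement into the next one."""
    for statement in statements:
        if check(statement["condition"], value, statement["target"]):
            value = statement["then"]
        elif "otherwise" in statement:
            value = statement["otherwise"]
        # No otherwise: the current value is passed on unchanged.
    return value


statement = {"condition": "is", "target": "1", "then": "first_type", "otherwise": "default_type"}
print(apply_if_statements("2", [statement]))  # default_type
print(apply_if_statements("1", [statement]))  # first_type
```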
Casting¶
The casting object lets you cast whatever value is found to a new type. Currently integer, decimal and date are supported. original_format is optional helper data needed in some special cases where the format of the input value cannot be determined automatically.
name | type | description | default |
---|---|---|---|
to | one of ["integer", "decimal", "date"] | What type to cast the value to | |
original_format | "integer_containing_decimals" or a specific date format (see below) | For some values we need to specify extra information in order to cast them correctly. | None |
About original format
Note
When to is date, original_format is required.
when to is | original format | description |
---|---|---|
decimal | integer_containing_decimals | Used when an integer value should be cast to decimal and needs to be divided by 100 |
date | yyyy.mm.dd, yy.mm.dd, yymmdd, dd.mm.yyyy, dd.mm.yy, ddmmyy | The format of the input date, where . means any delimiter. Output is always an ISO date, yyyy-mm-dd |
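To show what the two original_format cases amount to, here is a small Python sketch using only the standard library. Both function names are hypothetical, and the date helper covers a single undelimited format (ddmmyyyy) rather than the full list above.

```python
from datetime import datetime
from decimal import Decimal


def cast_integer_containing_decimals(value: str) -> Decimal:
    """Treat the last two digits as decimals, i.e. divide the integer by 100."""
    return Decimal(value) / Decimal(100)


def cast_ddmmyyyy_to_iso(value: str) -> str:
    """Parse a ddmmyyyy date and return it as an ISO date (yyyy-mm-dd)."""
    return datetime.strptime(value, "%d%m%Y").date().isoformat()


print(cast_integer_containing_decimals("10050"))  # 100.5
print(cast_ddmmyyyy_to_iso("01012001"))  # 2001-01-01
```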
Examples
{
"to": "decimal",
"original_format": "integer_containing_decimals"
}
"10050" -> Decimal(100.50)
{
"to": "date",
"original_format": "ddmmyyyy"
}
"01012001" -> "2001-01-01"
Branching Object¶
The branching object is a special object that does not have attributes or child objects, but has a special branching_attributes child. The point of this object is to make sure that we can map data from different sources into the same element. For example, say we have an object called "extradata" with the attributes 'name' and 'data'. This is a kind of field that can hold many things: one instance with 'name' = 'extra_address_line1' and another with 'name' = 'extra_address_line2'. Each instance must then get its data from a different place, and that's what branching objects are for.
name | type | description |
---|---|---|
name | str | Name of the object |
array | bool | if it should be an array or not |
iterables | array[iterable] | Lets you iterate over lists in input data and apply configuration to every iteration of the lists |
branching_attributes | array[array[attribute]] | list of list of attributes where each list of attributes will create a branching object. |
Example
{
"name": "extradata",
"array": true,
"branching_attributes": [
[
{
"name": "name",
"default": "extra_address_line1"
},
{
"name": "data",
"mappings": [{"path": ["list", "to", "line1", "value"]}]
}
],
[
{
"name": "name",
"default": "extra_address_line2"
},
{
"name": "data",
"mappings": [{"path": ["list", "to", "line2", "value"]}]
}
]
]
}
This will produce:
{
"extradata": [
{
"name": "extra_address_line1",
"data": "address value 1"
},
{
"name": "extra_address_line2",
"data": "address value 2"
}
]
}
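The way each inner list of attributes becomes one element of the output array can be sketched as follows. This is a simplified illustration under stated assumptions: `resolve_attribute` and `build_branching_object` are hypothetical helpers, only path and default are handled, the first mapping that finds a value wins, and the input data is made up to match the example paths.

```python
from typing import Any, Dict, List


def resolve_attribute(data: Any, attribute: Dict[str, Any]) -> Any:
    """Return the first value found by the attribute's mappings, else its default."""
    for mapping in attribute.get("mappings", []):
        current = data
        try:
            for step in mapping["path"]:
                current = current[step]
        except (KeyError, IndexError, TypeError):
            current = None
        if current is not None:
            return current
    return attribute.get("default")


def build_branching_object(data: Any, config: Dict[str, Any]) -> Dict[str, List[Dict[str, Any]]]:
    """Create one output object per list in branching_attributes."""
    branches = [
        {attribute["name"]: resolve_attribute(data, attribute) for attribute in branch}
        for branch in config["branching_attributes"]
    ]
    return {config["name"]: branches}


config = {
    "name": "extradata",
    "array": True,
    "branching_attributes": [
        [
            {"name": "name", "default": "extra_address_line1"},
            {"name": "data", "mappings": [{"path": ["list", "to", "line1", "value"]}]},
        ],
        [
            {"name": "name", "default": "extra_address_line2"},
            {"name": "data", "mappings": [{"path": ["list", "to", "line2", "value"]}]},
        ],
    ],
}
# Hypothetical input shaped to fit the paths in the example configuration.
source = {"list": {"to": {"line1": {"value": "address value 1"}, "line2": {"value": "address value 2"}}}}

print(build_branching_object(source, config))
```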