Data-Acquisition Preprocessor
The Data-Acquisition Preprocessor on t6 runs in 8 distinct phases before data analysis itself.
Tagged on #preprocessor, #feature,
Phases 1-3: Filter and secure
- Signature validation makes sure the payload originated from the expected source and has not been tampered with.
- Encryption makes sure only the t6 API is able to read the content of the payload.
The first two steps (signature and encryption) can be combined. In the third step, any payload that fails either signature validation or decryption is rejected and does not enter the Data Integration process.
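The signature check can be sketched as follows. This is a minimal Python illustration, not t6's implementation: the scheme (HMAC-SHA256 over the raw payload with a shared secret) is an assumption, since the text above does not name an algorithm, and `verify_signature` is a hypothetical helper. Decryption is left out.

```python
import hashlib
import hmac


def verify_signature(payload: bytes, signature_hex: str, secret: bytes) -> bool:
    """Return True when the signature matches the payload (assumed HMAC-SHA256)."""
    expected = hmac.new(secret, payload, hashlib.sha256).hexdigest()
    # compare_digest avoids leaking information through timing differences
    return hmac.compare_digest(expected, signature_hex)


secret = b"shared-secret"  # hypothetical secret known to the sender and to t6
payload = b'{"value": 21.5}'
signature = hmac.new(secret, payload, hashlib.sha256).hexdigest()

# Untouched payload passes; a modified payload would be rejected in the third step
print(verify_signature(payload, signature, secret))
print(verify_signature(b'{"value": 99}', signature, secret))
```

A payload for which this check returns False would not go any further in the process.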
Phase 4: Processor Modules
Each Flow can contain an optional preprocessor attribute, given as an array of preprocessors. The payload can also carry this preprocessor attribute when adding a Datapoint. When both define the attribute, the payload preprocessor overrides the Flow's.
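The precedence rule can be pictured with a short Python sketch. The function name `effective_preprocessors` and the dictionary shapes are assumptions made for illustration only; the rule itself (payload wins over Flow) is the one described above.

```python
from typing import Any


def effective_preprocessors(flow: dict[str, Any], payload: dict[str, Any]) -> list[dict[str, Any]]:
    """Return the payload's preprocessor list if present, else the Flow's."""
    if "preprocessor" in payload:
        return list(payload["preprocessor"])
    # The attribute is optional on the Flow, so default to an empty list
    return list(flow.get("preprocessor", []))


flow = {"preprocessor": [{"name": "sanitize", "datatype": "float"}]}
payload = {"value": "John Smith", "preprocessor": [{"name": "transform", "mode": "snakeCase"}]}

print(effective_preprocessors(flow, payload))          # payload list is used
print(effective_preprocessors(flow, {"value": "12"}))  # falls back to the Flow list
```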
During phase 4, preprocessors can apply several modifications and checks to measures: transformation, conversion, sanitization, validation, and Automatic Identification and Data Capture (AIDC).
Transformation processors
Incoming values posted to Flows can be modified by one or more preprocessors before they reach the Rule engine and before any storage in TimeseriesDb. The available transformers are listed in the table below, where the input is "Consequatur quis veniam natus ut qui."
Mode | Example output |
---|---|
camelCase | consequaturQuisVeniamNatusUtQui |
capitalCase | Consequatur Quis Veniam Natus Ut Qui |
constantCase | CONSEQUATUR_QUIS_VENIAM_NATUS_UT_QUI |
dotCase | consequatur.quis.veniam.natus.ut.qui |
headerCase | Consequatur-Quis-Veniam-Natus-Ut-Qui |
noCase | consequatur quis veniam natus ut qui |
paramCase | consequatur-quis-veniam-natus-ut-qui |
pascalCase | ConsequaturQuisVeniamNatusUtQui |
pathCase | consequatur/quis/veniam/natus/ut/qui |
sentenceCase | Consequatur quis veniam natus ut qui |
snakeCase | consequatur_quis_veniam_natus_ut_qui |
upperCase | CONSEQUATUR QUIS VENIAM NATUS UT QUI. |
aes-256-cbc | e.g.: f5596c8aa3bfaf03882554760518e4b7:f30aaa6f54e175d2a4579eb228[... ...]9469810ba0a495d1643 . The string before the colon is the initialization vector (IV) |
When mode is "aes-256-cbc", the payload must contain an extra attribute, object_id, whose value identifies the Object that knows the secret.
E.g:
"preprocessor": [
{
"name": "transform",
"mode": "snakeCase"
}
]
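To make the case modes concrete, here is a rough Python sketch that reproduces a few rows of the table. It is not the code t6 runs: `split_words` and `transform` are hypothetical helpers, and the word-splitting rule (letters and digits only) is an assumption that happens to match the example input.

```python
import re


def split_words(text: str) -> list[str]:
    """Split on anything that is not a letter or digit, dropping punctuation."""
    return re.findall(r"[A-Za-z0-9]+", text)


def transform(text: str, mode: str) -> str:
    """Apply one of a handful of case modes to the input text."""
    words = [w.lower() for w in split_words(text)]
    if mode == "snakeCase":
        return "_".join(words)
    if mode == "constantCase":
        return "_".join(words).upper()
    if mode == "paramCase":
        return "-".join(words)
    if mode == "pascalCase":
        return "".join(w.capitalize() for w in words)
    if mode == "camelCase":
        # First word stays lowercase, the following words are capitalized
        return "".join(words[:1] + [w.capitalize() for w in words[1:]])
    raise ValueError(f"unsupported mode: {mode}")


print(transform("Consequatur quis veniam natus ut qui.", "snakeCase"))
# consequatur_quis_veniam_natus_ut_qui
```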
An additional transform mode, replace, substitutes parts of a string based on a regular expression. E.g:
"value": "John Smith",
"preprocessor": [
{
"name": "transform",
"mode": "replace",
"pattern": "(\\w+)\\s(\\w+)",
"replacer": "$2, $1"
}
]
As a consequence, t6 will transform the passed value from "John Smith" to "Smith, John".
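The same substitution can be reproduced in Python as a sanity check on the pattern and replacer. This is only a sketch: `js_style_replace` is a hypothetical helper, and it assumes that `$1`, `$2` follow JavaScript-style group references and that every match is replaced, neither of which is stated above.

```python
import re


def js_style_replace(value: str, pattern: str, replacer: str) -> str:
    """Apply a regex replacement whose template uses $1, $2, ... group references."""
    # Translate "$2" into Python's unambiguous "\g<2>" syntax
    template = re.sub(r"\$(\d+)", lambda m: f"\\g<{m.group(1)}>", replacer)
    return re.sub(pattern, template, value)


# The JSON string "(\\w+)\\s(\\w+)" decodes to the regex (\w+)\s(\w+)
print(js_style_replace("John Smith", r"(\w+)\s(\w+)", "$2, $1"))  # Smith, John
```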
Conversion processors
The convert preprocessor is a very simple unit converter for the following main unit types:
- time
- distance
- mass
- volume
- storage
- things
- temperature (not implemented)
E.g:
"preprocessor": [
{
"name": "convert",
"type": "distance",
"from": "km",
"to": "m"
}
]
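A converter of this kind can be sketched as a table of factors relative to a base unit. The Python below is an illustration under assumptions: only the distance type with km and m appears in the example above, while the other unit names, the `FACTORS` table and the `convert` signature are hypothetical.

```python
# Factors relative to a base unit per type (metres for distance, seconds for time).
# Only "distance" with "km" and "m" comes from the example; the rest is assumed.
FACTORS = {
    "distance": {"km": 1000.0, "m": 1.0, "cm": 0.01},
    "time": {"h": 3600.0, "min": 60.0, "s": 1.0},
}


def convert(value: float, unit_type: str, from_unit: str, to_unit: str) -> float:
    """Convert value between two units of the same type via the base unit."""
    units = FACTORS[unit_type]
    return value * units[from_unit] / units[to_unit]


print(convert(5, "distance", "km", "m"))  # 5000.0
```

Going through a base unit keeps the table linear in the number of units. It also hints at why temperature may need separate handling, since Celsius and Fahrenheit differ by an offset and not only by a factor.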
Sanitization processors
The sanitize preprocessor makes sure the value uses the expected Datatype.
Note: t6 enforces this sanitization automatically according to the Flow, but it can also be added manually if you need to sanitize before any other processor.
E.g:
"preprocessor": [
{
"name": "sanitize",
"datatype": "float"
}
]
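In spirit, sanitization amounts to casting the incoming value to the expected type. The Python sketch below assumes a small set of datatype names; only "float" appears in the example above, and the `sanitize` helper with its other casters is hypothetical.

```python
from typing import Any, Callable

# Only "float" is taken from the example; "integer" and "string" are assumed names
CASTERS: dict[str, Callable[[Any], Any]] = {
    "float": float,
    "integer": int,
    "string": str,
}


def sanitize(value: Any, datatype: str) -> Any:
    """Cast value to the expected datatype, raising ValueError when it cannot be cast."""
    if datatype not in CASTERS:
        raise ValueError(f"unknown datatype: {datatype}")
    return CASTERS[datatype](value)


print(sanitize("21.5", "float"))  # 21.5
```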
An additional attribute, data_type, can be added when adding a [Datapoint](/features/Datapoints) to *t6*; it contains the uuid-v4 of the corresponding [Datatype](/features/data-types/). In this case the automatically added sanitization uses the specified [Datatype](/features/data-types/).
E.g:
```json
"data_type": "e7dbdc23-5fa8-4083-b3ec-bb99c08a2a35",
```
Validation processors
Validation in the preprocessor checks the value sent to t6 and, when the value fails the test, rejects it from any storage. Validation also allows a Decision Rule to follow up on the value.
The validation tests are the following:
Test | Example input | Result |
---|---|---|
isEmail | rejectedEmail@______domain.com | ❌ due to invalid domain name |
isEmail | AcCePtEdEmail@domain.com | ✅ |
isAscii | Rejected char 😀 | ❌ due to invalid character |
isBase32 | aBCDE23= | ❌ |
isBase32 | 89gq6t9k68== | ✅ |
isBase58 | aBCDE23= | ❌ |
isBase58 | a4E9kYnK== | ✅ |
isBase64 | aBCDE23== | ❌ |
isBase64 | QmFzZTY0== | ✅ |
isBIC | ABC-FR23GHI | ❌ |
isBIC | ABCFRPP | ✅ |
isBoolean | Truezz | ❌ |
isBoolean | True | ✅ |
E.g:
"preprocessor": [
{
"name": "validate",
"test": "isEmail"
}
]
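Two of the tests from the table can be approximated in Python to show how a validation step gates storage. This is a sketch, not the rules t6 applies: the set of accepted boolean strings is an assumption, and `validate` and `TESTS` are hypothetical names.

```python
def is_ascii(value: str) -> bool:
    """True when every character is in the 7-bit ASCII range."""
    return all(ord(ch) < 128 for ch in value)


def is_boolean(value: str) -> bool:
    """True for common boolean spellings (the accepted set is an assumption)."""
    return value.strip().lower() in {"true", "false", "0", "1"}


TESTS = {"isAscii": is_ascii, "isBoolean": is_boolean}


def validate(value: str, test: str) -> bool:
    """Run the named test; a False result means the value would not be stored."""
    return TESTS[test](value)


print(validate("True", "isBoolean"))              # True
print(validate("Truezz", "isBoolean"))            # False
print(validate("Rejected char \U0001F600", "isAscii"))  # False
```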
Automatic Identification and Data Capture (AIDC)
The AIDC preprocessor handles image preprocessing to identify objects, faces, and facial expressions.
The AIDC modes are the following:
Mode | Purpose |
---|---|
faceExpressionRecognition | Identify the best match expression (neutral, happy, sad, angry, fearful, disgusted, surprised) from the 1st face detected in the image |
genderRecognition | Identify the gender (male, female) from the 1st face detected in the image |
ageRecognition | Identify the age from the 1st face detected in the image |
E.g:
"preprocessor": [
{
"name": "aidc",
"mode": "faceExpressionRecognition"
}
]