Add grouping feature by uid attribute #6

Open
wadahiro wants to merge 1 commit into Evolveum:master from openstandia:feat-group-by

@wadahiro wadahiro commented Oct 2, 2023

I added a groupByEnabled option, which groups records by the uid attribute. When it is enabled, CSV records that span multiple rows per account are aggregated by the uid attribute and read as a JSON string.

Motivation

In our experience with IDM projects, the CSV files we import into the IDM system often contain multiple rows per account. For example, consider the following CSV:

id;name;dept;title
1;john;abc;engineer
1;john;efg;manager
2;jack;abc;manager

In this example, the first row means that john belongs to the abc department as an engineer, and the second row means that john also belongs to the efg department as a manager.

The current CSV connector cannot handle this kind of data, where one account spans multiple rows. For this use case we have so far had to write a custom BulkAction that reads the CSV file and updates midPoint user data.
We would be happy if the CSV connector supported such use cases natively, so I have created this pull request.

How does this option work?

When groupByEnabled is enabled and the executeQuery method is called, the connector groups rows by the value of the Uid and maps each group, as a JSON string, to a special attribute, __RAW_JSON__. For john in the CSV example above, the following JSON string is mapped. Note that the other attributes are taken from the first row of each group.

[
  {
    "name": "john",
    "id": "1",
    "dept": "abc",
    "title": "engineer"
  },
  {
    "name": "john",
    "id": "1",
    "dept": "efg",
    "title": "manager"
  }
]
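To make the grouping step concrete, here is a minimal, standalone Java sketch of the idea. It is not the code in this pull request: the class name GroupBySketch, the helpers groupByUid, toJson, quote and row, and the hand-rolled JSON serialization are all assumptions made for illustration, and the rows are hard-coded instead of being read from a CSV file.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Illustrative sketch only; not the connector's actual implementation.
public class GroupBySketch {

    // Hypothetical helper: groups parsed CSV rows by the value of the uid column,
    // keeping uids in the order in which they first appear.
    static Map<String, List<Map<String, String>>> groupByUid(
            List<Map<String, String>> rows, String uidColumn) {
        Map<String, List<Map<String, String>>> groups = new LinkedHashMap<>();
        for (Map<String, String> row : rows) {
            groups.computeIfAbsent(row.get(uidColumn), k -> new ArrayList<>()).add(row);
        }
        return groups;
    }

    // Serializes the rows of one group as a JSON array of objects.
    // Assumes all keys and values are non-null.
    static String toJson(List<Map<String, String>> rows) {
        return rows.stream()
                .map(row -> row.entrySet().stream()
                        .map(e -> quote(e.getKey()) + ":" + quote(e.getValue()))
                        .collect(Collectors.joining(",", "{", "}")))
                .collect(Collectors.joining(",", "[", "]"));
    }

    // Minimal escaping (backslash and double quote only), enough for this sketch.
    static String quote(String s) {
        return "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    // Builds one row with the columns of the example CSV.
    static Map<String, String> row(String id, String name, String dept, String title) {
        Map<String, String> r = new LinkedHashMap<>();
        r.put("id", id);
        r.put("name", name);
        r.put("dept", dept);
        r.put("title", title);
        return r;
    }

    public static void main(String[] args) {
        List<Map<String, String>> rows = List.of(
                row("1", "john", "abc", "engineer"),
                row("1", "john", "efg", "manager"),
                row("2", "jack", "abc", "manager"));

        groupByUid(rows, "id").forEach((uid, group) -> {
            // Other attributes come from the first row of the group.
            Map<String, String> account = new LinkedHashMap<>(group.get(0));
            // The whole group is attached as a JSON string.
            account.put("__RAW_JSON__", toJson(group));
            System.out.println(uid + " -> " + account);
        });
    }
}
```

The sketch uses a LinkedHashMap so that accounts come out in the order their uid first appears in the file; whether the connector preserves that order is not stated here.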

Example inbound mapping on midPoint side

Here is an example inbound mapping from a CSV resource definition. It searches for an OrgType by the value of the dept column in the CSV and assigns that organization. The value of the title column is added as a subtype of the assignment to express which position in the organization is being assigned.

            <attribute>
                <c:ref>ri:__RAW_JSON__</c:ref>
                <inbound>
                    <expression>
                        <script>
                            <code>
                                import groovy.json.JsonSlurper
                                import com.evolveum.midpoint.xml.ns._public.common.common_3.*
                                
                                json = new JsonSlurper().parseText(input)
                                assignments = json.collect { [midpoint.searchObjectByName(OrgType.class, it.dept), it.title] }
                                    .findAll { it[0] != null }
                                    .collect {
                                        ref = new ObjectReferenceType()
                                        ref.setOid(it[0].oid)
                                        ref.setType(OrgType.COMPLEX_TYPE)

                                        assignment = new AssignmentType()
                                        assignment.setTargetRef(ref)
                                        assignment.subtype("dept")
                                        assignment.subtype(it[1])
                                        return assignment
                                    }
                                
                                log.info("created assignments: {}", assignments)
                                
                                return assignments
                            </code>
                        </script>
                    </expression>
                    <target>
                        <path>$focus/assignment</path>
                        <set>
                            <condition>
                                <script>
                                    <code>
                                        assignment?.subtype?.contains('dept')
                                    </code>
                                </script>
                            </condition>
                        </set>
                    </target>
                </inbound>
            </attribute>
