Joining data via INNER JOIN, LEFT JOIN, RIGHT JOIN and FULL JOIN
We already learned about the possibility of adding a new column when manipulating-columns and to change it by a Look Up to another data source. This is useful when only a single column is involved. The join step goes one step further. Here, two table-like datasets are linked together. You know this technique in the field of relational databases, such as a SQL or Oracle database. There it works the same way.
The source table of the current data flow is always seen as the “left” table. The newly added (i.e. joined) table is the “right” table. The following example shows the addition of an additional table with vice presidents to the well-known example of American presidents. In the lower area you have to define the columns where the join takes place. In the example there is only one link, namely the column “FullName” in the original table and the column “Name” in the linked table. Both contain the full name of the respective president and serve as key attribute for the join step.
The Join Type defines the logic used to join the two tables. Here, too, we are guided by the usual terms of relational connections:
The result contains only the rows where there is a match in both tables. This is the most common method. However, rows will be lost if there are rows in the left or right table that do not have a match in the other table.
No rows will be lost. If there are rows without a match in the left or right table, the columns of the other table will remain empty in this case.
The rows of the left table are completely preserved. The rows of the right table only find their way into the result if they have a counterpart on the left.
The rows of the right table are completely preserved. The rows of the left table only find their way into the result if they have a counterpart on the right side.
The following example nicely shows a real full join. The first row is only present in the original table, so the location remains empty. The last row is missing in the original table, so the customer and location from the right table are displayed. All other columns are either empty or 0 (depending on whether it is a string or a number).
You can join not only data sources, but of course other dataflows as well.