Some examples relate to data deduplication. In an example, upon addition or modification of a data unit in a data storage device, a Context Triggered Piecewise Hash (CTPH) key may be generated for the added or modified data unit. The CTPH key of the added or modified data unit may be compared with a group CTPH key of each of a plurality of groups of data units stored in the data storage device, to identify a group whose group CTPH key is within a pre-defined edit distance of the CTPH key of the added or modified data unit. A duplicate of the added or modified data unit may then be identified within the identified group.
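
The flow described above may be sketched as follows. This is a minimal illustration under stated assumptions, not the claimed implementation: `toy_ctph` is a simplified stand-in for a real CTPH algorithm, the edit distance is assumed to be Levenshtein distance, the closest qualifying group is chosen when several qualify, and a duplicate is assumed to mean a byte-identical data unit. All function names, parameters, and the `window` and `modulus` values are hypothetical.

```python
import hashlib

B64 = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"


def toy_ctph(data: bytes, window: int = 4, modulus: int = 8) -> str:
    """Toy CTPH key: one character per content-defined chunk (illustrative only)."""
    key = []
    start = 0
    for i in range(len(data)):
        # Context value over the last `window` bytes; recomputed each step for clarity.
        rolling = sum(data[max(0, i - window + 1): i + 1])
        if rolling % modulus == modulus - 1:  # trigger point: close the current chunk
            digest = hashlib.sha256(data[start:i + 1]).digest()
            key.append(B64[digest[0] % 64])
            start = i + 1
    if start < len(data):  # trailing bytes after the last trigger point
        digest = hashlib.sha256(data[start:]).digest()
        key.append(B64[digest[0] % 64])
    return "".join(key)


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance (assumed metric), computed row by row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]


def find_group(key, group_keys, max_distance):
    """Return the id of the closest group within max_distance, or None."""
    best_id, best_dist = None, None
    for group_id, group_key in group_keys.items():
        dist = edit_distance(key, group_key)
        if dist <= max_distance and (best_dist is None or dist < best_dist):
            best_id, best_dist = group_id, dist
    return best_id


def find_duplicate(data, groups, group_keys, max_distance):
    """Search only the identified group for a byte-identical data unit."""
    group_id = find_group(toy_ctph(data), group_keys, max_distance)
    if group_id is None:
        return None
    for unit in groups[group_id]:
        if unit == data:  # assumption: duplicate means exact byte equality
            return group_id, unit
    return None
```

In this sketch, `groups` maps a group identifier to the data units stored in that group and `group_keys` maps the same identifier to its group CTPH key. Comparing against one key per group, rather than against every stored data unit, is what limits the duplicate search to a single group.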