Race is a word that carries cultural, phenotypic, and genotypic connotations. This mix of attributes usually leads to controversy and confusion, sometimes to the degree that some people claim the concept of race is meaningless and should therefore be eradicated from language. But if the word is so meaningless, why is it still at the center of so many discussions and studies?
Some people would argue that the motivation for keeping the concept of race alive comes only from the desire to push certain political agendas. Although this can sometimes be true, it can also be understood as the simple human inclination to explain patterns and give structure to our conceptions of the world, with the concept of race arising from the legitimate question of how populations relate to each other.
People tend to group elements based on the properties they share. It is just how the human mind works. The most immediate traits our ancestors could perceive in each other were properties such as height, body shape, skull morphology, skin color, and facial features, so it was natural for them to try to classify people based on physical properties.
We now know that physical traits are not a very reliable way to determine biological similarity between two populations, since physical properties can result from adaptive changes to similar but distant environments. For example, skin color is related more to a population’s geographic distance from the Equator than to its ancestry, as melanin pigmentation protects against the high UV radiation of sunny climates. So we can find populations with similar skin color in Sub-Saharan Africa and Melanesia, even though those two populations do not share a relatively recent ancestor.
Furthermore, advances in molecular biology have provided us with much more precise tools to establish biological similarity between populations, or even between individuals.
DNA molecules are composed of four kinds of smaller molecules, arranged one after another in sequences that can be read as strings of letters: “C” for cytosine, “G” for guanine, “A” for adenine, and “T” for thymine.
The human genome is composed of approximately 3,234 million of these “letters”. Every person possesses one copy inherited from their mother and another inherited from their father, so every individual carries approximately 6,468 million of these letters, ordered in specific patterns, in their cells. These letters encode a great amount of information about how cells work and about the mating events that produced each one of us.
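As a quick check on these figures, here is a minimal Python sketch that treats a piece of DNA as a string and reproduces the doubling arithmetic. The fragment and the function name `is_dna` are made up for illustration; only the genome size comes from the text.

```python
# Approximate number of letters in one copy of the human genome (figure from the text).
HAPLOID_LETTERS = 3_234_000_000

# Each person carries two copies, one from each parent.
DIPLOID_LETTERS = 2 * HAPLOID_LETTERS

# A made-up fragment, used only to show DNA written as a string of C, G, A and T.
fragment = "GATTACACCGT"


def is_dna(sequence: str) -> bool:
    """Return True if the string uses only the four DNA letters."""
    return set(sequence) <= set("ACGT")


print(f"{DIPLOID_LETTERS:,} letters in two copies")
print("fragment is valid DNA:", is_dna(fragment))
```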
DNA replication isn’t perfect, and some letters get replaced by other letters in the copies that parents pass to their children. This produces new, unique strings of letters that can be associated with a specific individual. When this individual reproduces, she or he passes these unique sequences on to her or his children, who can then be identified as descendants of this individual. Thanks to this process, we can identify the unique DNA sequences shared within a population and estimate how long ago their common ancestors were alive.
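The idea of a copying error becoming a marker for a family line can be sketched in a few lines of Python. Everything here is a toy: the ten-letter sequences and the names `variants`, `founder`, `child_one`, `child_two`, and `unrelated` are invented for illustration, and real analyses work on aligned sequences that are millions of letters long.

```python
def variants(reference: str, sequence: str) -> set:
    """Return (position, new_letter) pairs where `sequence` differs from `reference`."""
    if len(reference) != len(sequence):
        raise ValueError("sequences must have the same length")
    return {(i, b) for i, (a, b) in enumerate(zip(reference, sequence)) if a != b}


# Toy sequences (assumed, not real data). Positions are counted from 0.
ancestor = "ACGTACGTAC"
founder = "ACGTTCGTAC"    # one copying error at position 4 (A became T)
child_one = "ACGTTCGTAA"  # carries the founder's change plus a new one at position 9
child_two = "GCGTTCGTAC"  # carries the founder's change plus a new one at position 0
unrelated = "ACGTACGAAC"  # has its own change and lacks the founder's marker

founder_marker = variants(ancestor, founder)

# Changes that both children share point back to their common ancestor, the founder.
shared = variants(ancestor, child_one) & variants(ancestor, child_two)

print("founder marker:", founder_marker)
print("shared by both children:", shared)
print("unrelated carries marker:", founder_marker <= variants(ancestor, unrelated))
```

In this toy case the change shared by both children is exactly the founder's marker, while each child also carries one private change of its own.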
The approximately 3,234 million letters that each parent passes to a child are assembled by randomly taking DNA sequences from the 6,468 million total letters that the parent carries. The result is a new, unique copy for each child, with each copy being a mosaic that contains half of that parent’s genetic material. So we didn’t inherit either of the copies that our grandmothers and grandfathers passed to our parents, but a mix of those copies. That’s why brothers and sisters can be very similar or very different: those who are similar are likely to have inherited similar mixes of their parents’ genetic material, and those who are different are likely to have inherited less similar mixes. On top of that come the new sequences created by the replication errors mentioned before.
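The mixing step can be illustrated with a small simulation. This is only a sketch under simplifying assumptions: a parent's two copies are 20 positions long, each position is labelled "M" (from the grandmother) or "F" (from the grandfather) instead of holding a real DNA letter, and the copy passed on switches source at a fixed number of random points. The function `make_gamete` and its parameters are hypothetical names.

```python
import random


def make_gamete(copy_a: str, copy_b: str, crossovers: int, rng: random.Random) -> str:
    """Build one inherited copy by switching between a parent's two copies at random points."""
    if len(copy_a) != len(copy_b):
        raise ValueError("both copies must have the same length")
    length = len(copy_a)
    # Pick distinct switch points strictly inside the sequence.
    points = sorted(rng.sample(range(1, length), crossovers))
    sources = [copy_a, copy_b]
    current = rng.randrange(2)  # start from either copy at random
    pieces, start = [], 0
    for point in points + [length]:
        pieces.append(sources[current][start:point])
        current = 1 - current  # switch to the other copy after each point
        start = point
    return "".join(pieces)


rng = random.Random(0)  # fixed seed so the run is repeatable

# One parent's two copies, labelled by which grandparent they came from.
from_grandmother = "M" * 20
from_grandfather = "F" * 20

# Two children receive two independently mixed copies from the same parent.
sibling_one = make_gamete(from_grandmother, from_grandfather, 2, rng)
sibling_two = make_gamete(from_grandmother, from_grandfather, 2, rng)

same = sum(a == b for a, b in zip(sibling_one, sibling_two))
print(sibling_one)
print(sibling_two)
print(f"positions with the same grandparental origin: {same} of 20")
```

Running it with different seeds shows how two siblings can end up with very similar or quite different mosaics from the same parent.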
All these processes make each one of us a very complex mosaic of DNA sequences, reshuffled every generation, which makes it difficult to reconstruct our past. Fortunately, the analysis can be simplified by focusing on some special DNA sequences that are not as affected by this mixing. They sit in DNA packages that do not go through the mixing process, so the package we inherit from a parent is almost identical to the one that parent inherited from a grandparent. These two DNA packages are the Y-chromosome, which is passed from every father to his sons, and the Mitochondrial-chromosome, which is passed from each mother to her daughters and sons. The Y-chromosome contains approximately 58 million letters, while the Mitochondrial-chromosome contains approximately 16,569 letters, which makes the Y-chromosome much more informative, although with the disadvantage of being present only in men.
Still, these special DNA packages are a good way to infer ancestry, and therefore to take a small look into our past and origins, tracing back some of the journeys our ancestors made, or even to better understand our biology and how to avoid or cure diseases.
The Y-chromosome and the Mitochondrial-chromosome have some well-known and well-characterized DNA sequences that define the Y-chromosome haplogroups and the Mitochondrial-chromosome haplogroups. These haplogroups allow us to estimate the geographic region where a haplogroup originated and how long ago that happened. By sequencing our Y-chromosomes and Mitochondrial-chromosomes, we can take a look at our patrilineal and matrilineal histories.
Our Y-chromosome haplogroups, for example, will tell us that if someone has any of the haplogroups A00, A0-P305, A1a-M31, A1b-M6, or B-M60, this person is likely to have a recent ancestor from Africa. Even finer regional detail can be inferred, such as the association of A00 with the Bangwa people in Cameroon, or that of A1b-M6 with the Khoisan people in Southern Africa. So haplogroups can be used to infer ancestry and relationships with particular populations.
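This kind of inference amounts to a lookup from haplogroup to associated region. A minimal Python sketch follows, using only the associations named in the paragraph above; the table `Y_HAPLOGROUP_NOTES` and the function `describe` are hypothetical names, and a real analysis would rely on a curated reference tree rather than a five-entry dictionary.

```python
# Associations copied from the text only; this is not a complete or authoritative table.
Y_HAPLOGROUP_NOTES = {
    "A00": "Africa (associated in the text with the Bangwa people in Cameroon)",
    "A0-P305": "Africa",
    "A1a-M31": "Africa",
    "A1b-M6": "Africa (associated in the text with the Khoisan people in Southern Africa)",
    "B-M60": "Africa",
}


def describe(haplogroup: str) -> str:
    """Return the region note for a haplogroup, or a fallback if it is not in the table."""
    return Y_HAPLOGROUP_NOTES.get(haplogroup, "not covered by this small table")


for name in ("A00", "A1b-M6", "X-0000"):  # "X-0000" is a made-up label for the fallback case
    print(f"{name}: {describe(name)}")
```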
People have always been fascinated by their origins, and we could welcome that curiosity by encouraging people to try to reconstruct their biological past: learning, for example, that every carrier of haplogroup C-M130 is likely to share a recent indigenous ancestor from Mongolia, Siberia, North America, or Australia, or that the D-M174 haplogroup originated in Asia, so that people of Asian descent are likely to carry it.
People with the E-M96 haplogroup would probably find it interesting to know that their ancestors originated in Africa, then traveled out of the continent toward Asia, only to return to Africa again. Finding DNA sequences correlated with each haplogroup could also be useful for developing better methods of prevention and treatment, or even for decoding some biological gems of functionality hiding in there.
Are haplogroups race separators? They could be if we give them that meaning, and we could do the same for other DNA sequences. That would mean accepting a broader notion of race, malleable enough to adapt to the scope of each study, but with the objective meaning that it indicates a genetic pattern found in certain populations, without making any a priori assumptions about the phenotype or cultural values of those populations.
Instead of trying to banish the word “race” from language, maybe we could accept its legitimacy in the form of questions such as which phylogenetic paths our ancestors followed, how close we are to certain populations, and what we can infer about our biology. We could accept that we have an amazing diversity, which allows us to find differences even between identical twins, and similarities between people on different continents. We want to make sense of all these relationships, whether out of simple curiosity or for the development of medicine and biology. And in the end, maybe rewriting the race categories as biologically meaningful patterns, and encouraging people to understand them, could be the only way to clear away old prejudices and ignorance.