Here is an example of how you can generate a wordset, using the harmonizome plugin.
In this example, let's assume you have an exome at your disposal, and that you know what the patient suffers from.
Harmonizome is a great database that references, to date, 114 gene ontology databases, including Reactome pathways, HPO gene-disease associations and KEGG pathways, just to name a few. These databases all provide associations between an 'attribute' and a gene name (see the about page on the Harmonizome website). This makes Harmonizome very useful for reliably creating gene sets, since you always get gene names for the attributes you are looking for.
Find the symptoms in the HPO database
Let's say your patient suffers from headaches. Open up the harmonizome plugin (Tools/Create wordset from harmonizome database...), and search for 'HPO' in the dataset search bar.
Now, if you type in 'headache' in the second search bar, it will look for the name of the symptom.
This is only true for the HPO gene-disease association dataset. Other databases name their gene sets according to their own conventions.
Now that you've searched for 'headache', you will end up with several list items, each showing the complete name of a symptom. If you click on one of them, another list view will appear on the right, showing the genes associated with that symptom according to the HPO database.
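To make the idea of an attribute-to-gene association more concrete, here is a minimal Python sketch of what this search amounts to. It is only an illustration, not the plugin's actual code: the dictionary, the `search_attributes` function and the gene names (`GENE_A`, `GENE_B`, ...) are all made-up placeholders, and the matching rule (case-insensitive substring) is an assumption.

```python
# Hypothetical, made-up data: attribute name -> associated gene names.
# The gene names are placeholders, not real HPO associations.
HPO_LIKE_DATASET = {
    "Headache": ["GENE_A", "GENE_B", "GENE_C"],
    "Migraine with aura (headache)": ["GENE_B", "GENE_D"],
    "Seizure": ["GENE_E"],
}


def search_attributes(dataset, query):
    """Return the attribute names that contain the query, ignoring case."""
    needle = query.strip().lower()
    return [name for name in dataset if needle in name.lower()]


# Typing 'headache' in the second search bar: list the matching attributes.
matches = search_attributes(HPO_LIKE_DATASET, "headache")

# Clicking on the first match: show the genes associated with it.
genes_for_first_match = HPO_LIKE_DATASET[matches[0]]
```

Each matching attribute corresponds to one list item, and the genes attached to it are what the view on the right displays.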
Select the genes from the resulting set
In the previous step, we saw how you can search for a gene ontology database, then select a gene set according to the phenotype you observe in your sample. In this step, we will guide you through the generation of the wordset itself, using the harmonizome plugin.
As you can see below, you can select all the genes you wish in the rightmost view, using Ctrl + A.
With those genes selected, you can add them to the selection using the button as shown below.
Once you've added the genes to the selection, you can repeat any of the previous steps as many times as you need. For example, you may want to select another gene set from the same database, or even choose from another dataset.
Every time you add genes to the set, any gene that is already in the selection will not be added again (there are no duplicates).
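If it helps, you can picture the selection as a list that ignores genes it already contains. The sketch below illustrates that behaviour in Python; `add_to_selection` and the gene names are hypothetical, and the plugin's real implementation may well differ (for instance, nothing here says whether it preserves insertion order).

```python
def add_to_selection(selection, genes):
    """Append each gene to the selection unless it is already present."""
    seen = set(selection)  # fast membership test
    for gene in genes:
        if gene not in seen:
            selection.append(gene)
            seen.add(gene)
    return selection


selection = []
# First gene set (placeholder names, e.g. genes listed for one symptom).
add_to_selection(selection, ["GENE_A", "GENE_B", "GENE_C"])
# Second gene set shares GENE_B with the first one: it is not added twice.
add_to_selection(selection, ["GENE_B", "GENE_D"])
```

After both calls, the selection holds four genes, with `GENE_B` appearing only once.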
Now, when you're happy with your selection, you can create a wordset that will contain every gene in the selection, by pressing the button as shown below.
This will open a simple text prompt so that you can name the wordset as you like.
You're done! What you can do next is use the generated wordset as indicated here and here.