You will find a user guide with practical information about how to navigate and explore the lexical networks, Frequently Asked Questions, and instructional videos. If you encounter a problem not covered on this page, please contact the Team 1 leader, Saana Svärd.
Aleksi Sahala, Heidi Jauhiainen, Tero Alstola, Sam Hardwick, Ellie Bennett, Tommi Jauhiainen, Krister Lindén and Saana Svärd “ANEE Lexical Networks v.2.0.”. URN: http://urn.fi/urn:nbn:fi:lb-2022100301.
When citing a specific lexical network, please cite it as: Aleksi Sahala, Heidi Jauhiainen, Tero Alstola, Sam Hardwick, Ellie Bennett, Tommi Jauhiainen, Krister Lindén and Saana Svärd “ANEE Lexical Networks v.2.0.”. See esp. “[title of the network]”. URN: http://urn.fi/urn:nbn:fi:lb-2022100301
For example, to cite the English version of the network of Neo-Assyrian texts in Assyrian, use: Aleksi Sahala, Heidi Jauhiainen, Tero Alstola, Sam Hardwick, Ellie Bennett, Tommi Jauhiainen, Krister Lindén and Saana Svärd “ANEE Lexical Networks v.2.0.”. See esp. “Neo-Assyrian texts in Assyrian with all Akkadian words (in English)”. URN: http://urn.fi/urn:nbn:fi:lb-2022100301.
The ANEE Lexical Networks v. 2.0 were designed to be easily understandable to those without detailed knowledge of Network Analysis. This guide is not an introduction to Network Analysis. Instead, it explains how to use and navigate the ANEE Lexical Portals so that you can make the most of them in your research.
On the Lexical Portal Webpage, you will see there are 35 Lexical Networks for you to explore. The webpage explains the difference between each one, so this will not be covered in detail here. For the sake of consistency, the Help Page will use the Lexical Network titled “All Akkadian texts with all Akkadian words (in English)” for examples.
When you click on one of the Lexical Portal links, you will be taken to the Lexical Portal, where you will see a banner at the top of the page and a group of brightly coloured dots.
The dots are what Network Analysis calls ‘nodes’, and each one represents an Akkadian word. They are sized according to a Network Analysis measurement called ‘weighted degree’. You can think of this as the sum of the Pointwise Mutual Information (PMI) scores related to a single Akkadian word. The bigger the node, the higher its weighted degree score.
The position of nodes in the network is determined by their connection to other words (as determined by PMI). Normally this would be shown as a line (or ‘edge’), where the thicker the line, the more likely the two words it is connecting will appear close together in an Akkadian text.
You will notice that the edges and labels have been turned off in the default view. You will also notice the nodes are in several colours. These indicate sub-groups of words that are more connected to each other than to the rest of the network. You can therefore get an immediate impression of trends and patterns in the network.
If you hover over a node, it will show you the node’s label, which has both the Akkadian word and its English translation. In the English networks, the English translation is first, followed by the Akkadian. Hovering over a node also highlights the edges and nodes it is connected to. Once you have found a word of interest, you can get more information about it by left-clicking the node. The network will then show only the word you’re interested in and the words it is immediately connected to.
Along the top of the Lexical Portals is a white header bar. There are useful tools and links within this header.
Items in the header are:
A search bar is provided to help you search for particular words. Searches in both Akkadian and English work in this search bar. As you type, a drop-down menu will appear with the possible matches. If the language matches the language of the network (for example, an Akkadian search in a network where the labels are in Akkadian), the drop-down list will show possible matches under ‘Nodes’. If you use the other language (for example, English in an Akkadian network), it will also show possible translations for the word you are typing under the heading ‘Translations’.
Once you have found the word you need, simply click on it. The network will then show you the word and its immediate neighbours, and the side panel will appear.
To the right of the search bar are four links. One is the title of the network you are investigating. By clicking this, you return to the default view of the network. This can also be achieved by the link titled ‘Main Graph’.
‘Main Page’ will take you back to the webpage with all of the Lexical Portals. If you wish to do so to compare networks, we recommend you open this as a new tab in your browser.
‘Download’ allows you to download the network graph. This is recommended if your interests in the network extend beyond the tools provided by the Lexical Portal. If you publish work based on the downloaded graph, remember to cite the Lexical Portal correctly.
On the left of the header is a hyperlink to Gephi. If you wish to explore these graphs on your computer using more Network Analysis tools than those provided on the Lexical Portal pages, we recommend downloading the free and beginner-friendly Gephi. You can then proceed to download the graphs and view them at your leisure.
By downloading the network you will be downloading the entire network prior to any filtering you have done in the portal. When you click ‘Download’ you will see a pop-up asking where you want to save the network. It will be saved as a .gexf file, which is easily opened in Gephi.
Note: when you open the network file in Gephi, it may show you some warnings. These only refer to the colours in the network, and so it is safe to continue.
When you open the network in Gephi you will see that it has included the following: weighted degree; degree; modularity class; a search url for the word in Korp; a link to the word’s ego network in the lexical portal; a translation of the word (if this was an English graph, it will be in Akkadian, and vice versa).
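If you would rather inspect the downloaded file with a script than in Gephi, a .gexf file is plain XML and can be read with Python's standard library. The following is a minimal sketch, not part of the portal: the function name `read_gexf` is our own, and the sample file content (node ids, labels and the attribute title "Modularity Class") is invented for illustration. Check the attribute titles in your own download, as they may be named differently.

```python
import io
import xml.etree.ElementTree as ET


def local(tag):
    """Drop the XML namespace so the code works across GEXF versions."""
    return tag.rsplit("}", 1)[-1]


def read_gexf(source):
    """Read a GEXF file (path or binary file object).

    Returns (nodes, edges):
      nodes: {node_id: {"label": ..., attribute title: value, ...}}
      edges: list of (source_id, target_id, weight)
    """
    root = ET.parse(source).getroot()

    # Map node attribute ids to their human-readable titles.
    titles = {}
    for elem in root.iter():
        if local(elem.tag) == "attributes" and elem.get("class") == "node":
            for attr in elem:
                if local(attr.tag) == "attribute":
                    titles[attr.get("id")] = attr.get("title") or attr.get("id")

    nodes, edges = {}, []
    for elem in root.iter():
        name = local(elem.tag)
        if name == "node":
            info = {"label": elem.get("label", "")}
            for child in elem.iter():
                if local(child.tag) == "attvalue":
                    key = child.get("for")
                    info[titles.get(key, key)] = child.get("value")
            nodes[elem.get("id")] = info
        elif name == "edge":
            weight = float(elem.get("weight", "1.0"))
            edges.append((elem.get("source"), elem.get("target"), weight))
    return nodes, edges


# Invented two-node sample standing in for a real download.
SAMPLE = b"""<gexf xmlns="http://gexf.net/1.3" version="1.3">
  <graph defaultedgetype="undirected">
    <attributes class="node">
      <attribute id="modularity_class" title="Modularity Class" type="integer"/>
    </attributes>
    <nodes>
      <node id="a" label="word A">
        <attvalues><attvalue for="modularity_class" value="3"/></attvalues>
      </node>
      <node id="b" label="word B"/>
    </nodes>
    <edges>
      <edge source="a" target="b" weight="0.5"/>
    </edges>
  </graph>
</gexf>"""

nodes, edges = read_gexf(io.BytesIO(SAMPLE))
print(len(nodes), "nodes and", len(edges), "edges")
```

To read a real download, pass the path of the saved file instead of the in-memory sample, for example `read_gexf("my_network.gexf")` (a hypothetical file name).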
At the bottom-left of the portal you will see several tools that will help you navigate and explore the network.
Items in the toolbar are:
This slider allows you to filter out nodes according to their size. In these networks, the size of a node corresponds to its weighted degree. The slider therefore filters the network according to weighted degree. In the default view it is at the maximum, so all nodes are in view.
If you hover over the slider, you will see a pop-up that tells you what percentage of the network is visible, and how many nodes are visible.
You can either click and drag the slider to the percentage you would like, or you can click the orange ‘+’ and ‘-’ buttons at the top and bottom.
This slider allows you to zoom in and out of the network. In the default view, it is set to be completely zoomed out so you can see the entire network. You can click and drag the slider to the percentage you would like, or you can click the orange ‘+’ and ‘-’ buttons at the top and bottom. This function will only zoom in to the centre of the network image (not the network). You can also use your mouse to hover over an area of interest, and then use the scroll wheel to zoom in and out of the network.
This checkbox allows you to turn on all the labels in the network. The labels are, simply, the labels of the nodes in the network. These will identify what node, and therefore what Akkadian word, you are looking at in the network.
To make the labels visible, you click the checkbox. It will now be a box with a blue tick. The labels will then appear as black sans serif text hovering over their related node. To turn them off, you click again on the checkbox, and the box will return to a white square.
We recommend filtering the network before you turn the labels on. The sheer number of nodes in the networks means the labels are so numerous that they clutter the screen and obscure the network.
Edges are the lines that connect the nodes in the network. Here they represent the PMI score between two words. The thicker the line, the more likely those two words will co-occur in an Akkadian text.
In the default view the edges are turned off. This makes broad patterns and trends easier to see at a glance. To turn the edges on, click the ‘Show edges’ box so that it is now a box with a blue tick. To remove the edges, click the box again, so it is now an empty white box.
At the bottom of the toolbar is the Label filter. You can enter a word, and this tool will filter the network according to that word.
For example, if you type in ‘king’ (and then click ‘Show labels’ if they aren’t already on), you will see all the words that have ‘king’ as part of their label. This includes words like ‘locking’ and ‘undertaking’.
If you only want words with ‘king’ as its own word, you can use double quotation marks around the word in the label filter. For example, the search for ‘“king”’ returns only labels with ‘king’ as its own word.
The label filter is also useful if you would like to see only the names of locations. Simply type ‘(Loc)’, and the network will display only the Akkadian words identified as place names.
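The two matching modes of the label filter can be illustrated with a short sketch. This is our own approximation of the behaviour described above, not the portal's code: the function name `label_filter`, the case-insensitive matching and the sample labels are all assumptions made for the example.

```python
import re


def label_filter(labels, query):
    """Approximate the label filter: plain text matches anywhere in a label,
    text in double quotation marks matches only as a word of its own."""
    query = query.strip()
    if len(query) >= 2 and query.startswith('"') and query.endswith('"'):
        word = query[1:-1]
        # No letter, digit or underscore directly before or after the word.
        pattern = re.compile(r"(?<!\w)" + re.escape(word) + r"(?!\w)", re.IGNORECASE)
        return [label for label in labels if pattern.search(label)]
    needle = query.lower()
    return [label for label in labels if needle in label.lower()]


# Invented labels, for illustration only.
labels = ["king", "locking", "undertaking", "king of the land", "Assur (Loc)"]

print(label_filter(labels, "king"))     # every label containing 'king'
print(label_filter(labels, '"king"'))   # only labels with 'king' as its own word
print(label_filter(labels, "(Loc)"))    # only labels marked as place names
```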
To move around the network, simply click the background (right- or left-click works), and drag in the direction you would like the network to move.
On the bottom right of the portal is a minimap of the network. If you get lost, this can tell you where you are in the broader network. You can also use this to navigate. If you zoom in on the network, you will see there is a little red box in the minimap. This indicates where you are viewing in relation to the whole network. If you click this red box and drag it to where in the network you would like to move to, it will also move the network itself.
Once you have found a word of interest, you can get more information about it by left-clicking the node. When you do this, a sidebar appears on the left side of the screen. The network then filters everything except the node you are interested in and the nodes immediately connected to it.
Items in the sidebar are:
If you want to return to the view of the main graph, you simply click the grey background. This will return to the view of the whole network, but you will still be able to see the side panel.
If you would like to remove the sidebar from view, there is a small button with two orange arrows at the top right of the sidebar. Once you click this, the sidebar is removed from view. If you would like to bring it back, you can click the same button with two arrows (now pointing the other way) at the top left of the network view.
At the top of the sidebar is the word you have clicked. Next to it is a coloured circle. This denotes the cluster (or ‘community’) the word is part of. If you click this coloured circle, it will only show the words connected to the word you are interested in that are part of the same cluster. To return to all of the words connected to the word of interest, click the coloured circle again.
If you then click on the grey background, you will see all the nodes that belong to this cluster. If you click on the coloured circle again, you will remove this filter and return to viewing the whole network.
Under this heading are some simple statistics about the word. It gives you the weighted degree and degree scores, as well as the number of the cluster it belongs to. The latter is called ‘modularity class’.
This link will take you to Korp, where you can make complex searches of the Akkadian texts stored in Oracc. It will take you directly to a search for the highlighted word across all Oracc sub-projects. A guide for how to make more complex and refined searches on Korp can be found at the metadata URN for Oracc in Korp, under the heading ‘Documentation’.
Once you have clicked a node, it and all the words it is connected to will maintain their position within the wider network. Whilst very useful, this can make it difficult to view all of the words connected to the word you’re interested in - especially if they are at extremes of the network.
By clicking ‘Go to this word’s ego graph’, you will be taken to a new graph. In this graph are the word you are interested in, and all of the words immediately connected to it. In the background, the layout algorithm is run again, and results in a new graph centred on your word of interest. In Network Analysis, this is called an ‘ego network’.
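For readers working with a downloaded graph, the idea of an ego network can be sketched in a few lines. This is an illustration only: the function name `ego_network` is ours, the edge list is invented apart from the “messenger” and “tired” pair quoted elsewhere in this guide, and we assume that edges between the neighbours themselves are kept. The sketch does not rerun the layout algorithm, which the portal does in the background.

```python
def ego_network(edges, ego):
    """Return the neighbours of `ego` and the edges among ego and neighbours.

    `edges` is a list of (word_a, word_b, weight) tuples for an undirected graph.
    """
    neighbours = set()
    for a, b, _weight in edges:
        if a == ego:
            neighbours.add(b)
        elif b == ego:
            neighbours.add(a)
    neighbours.discard(ego)  # ignore self-loops

    keep = neighbours | {ego}
    sub_edges = [(a, b, w) for a, b, w in edges if a in keep and b in keep]
    return neighbours, sub_edges


# 'word_x' and 'word_y' are placeholders, and their weights are made up.
edges = [
    ("messenger", "tired", 0.76),
    ("messenger", "word_x", 0.40),
    ("tired", "word_x", 0.10),
    ("word_x", "word_y", 0.30),
]

neighbours, sub_edges = ego_network(edges, "messenger")
print(sorted(neighbours))
print(sub_edges)
```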
To get back to the main graph, you click the link in the header which is the name of the graph you are viewing.
This is the translation of the word you are viewing. In Akkadian graphs this translation is given in English. In English graphs this is given in Akkadian.
Under the headings ‘Linked words’ and ‘Linked non-personal proper nouns’ are all the words connected to the word of interest. The words have the same format. On the left is a coloured circle, which indicates which cluster they are part of. Then there is the word itself, which is hyperlinked.
If you hover over this linked word, you will see it highlighted in the network view alongside its immediate neighbours. This allows for quick comparisons with the word of interest. If you click on this link, you will then move to a view where this new word and its neighbours are highlighted.
Finally, there is a number in brackets. This represents the weight of the edge between the word of interest and the word in the list. This therefore represents the PMI score between the two words.
At the very top right of the sidebar are two black or blue arrows. These help you navigate between current and past sidebar views - i.e. between the details of the current word and past words you were looking at.
This has been partially based on the User Guide prepared by Sam Hardwick in 2020. The previous version was based on an older version of the Lexical Portal. It can be found here: https://traubert.github.io/lexical_portal/
The 35 networks were created so that as many people as possible can use them. Those only interested in 1st Millennium data may have different interests than those only interested in names in Akkadian texts. Furthermore, we wanted to make sure the networks were as accessible as possible, so the suite of networks was duplicated with the labels in English.
The 35 networks are divided into 3 categories: 1) a network only of proper nouns; 2) those where the language displayed in the network is Akkadian; and 3) those where the language displayed is English.
In each of the sections there are 3 further subsections: 1) networks built upon all texts in Oracc that are tagged as ‘Akkadian’; 2) networks built upon data for the 1st and 2nd Millennia (as tagged in Oracc); and 3) networks based on Neo-Assyrian texts in Assyrian and Standard Babylonian (as tagged in Oracc).
Finally, each of these subsections has three further subsections: 1) a network that includes all the words; 2) a network without proper nouns; and 3) a network with only proper nouns.
For example, there is a network that visualizes the PMI scores of Akkadian words from 2nd Millennium texts, but does not include proper nouns (“2nd millennium texts without proper nouns (in Akkadian)”).
The data used for the graphs was downloaded as JSON files from the Open Richly Annotated Cuneiform Corpus (Oracc) in June 2021. For the analysis we used a dataset consisting of 7,346 texts that have been tagged in Oracc as written in “Akkadian.” We standardized the spellings of divine and place names and removed duplicate texts following the procedure explained in Alstola et al. (2019). We only used dictionary forms, as defined in Oracc (following Concise Dictionary of Akkadian), of content words (nouns, verbs, and adjectives), while all the other words have been replaced with an underscore character as a placeholder. Since neither the cuneiform script nor the Oracc metadata indicates sentence endings, the text of each document is handled as one continuous line of text.
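The placeholder step can be pictured with a small sketch. It is an illustration under stated assumptions, not our processing pipeline: the function name `to_content_line` is invented, each token is assumed to arrive as a (dictionary form, part-of-speech tag) pair, and the tags "N", "V", "AJ" and "PRP" are assumed stand-ins for noun, verb, adjective and preposition.

```python
# Assumed part-of-speech tags for content words: noun, verb, adjective.
CONTENT_POS = {"N", "V", "AJ"}


def to_content_line(tokens, placeholder="_"):
    """Turn one document into a single line of text, keeping the dictionary
    forms of content words and replacing every other word with a placeholder."""
    return " ".join(
        lemma if pos in CONTENT_POS else placeholder
        for lemma, pos in tokens
    )


# A made-up three-word sequence, not a quotation from any text.
document = [("našparu", "N"), ("ina", "PRP"), ("anīhu", "AJ")]
line = to_content_line(document)
print(line)
```

Keeping a placeholder instead of deleting the word preserves the distance between the remaining content words, which matters when co-occurrence is measured within a window of neighbouring words.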
From all the lexemes in our dataset, we chose those that appear at least 5 times. We then used Pointwise Mutual Information (PMI) to produce lists of the most semantically similar words to each of these 4,930 lexemes. These lists were then visualized with Gephi.
To produce the networks based on different time periods, we relied upon the metadata provided by Oracc. Those texts tagged with names of periods from the 2nd millennium were collated together, and the same for those from the 1st millennium. For the Neo-Assyrian data, we collected the texts tagged as ‘Neo-Assyrian’, and then within this corpus divided the texts according to the metadata tags ‘Assyrian’ and ‘Standard Babylonian’.
Please note that the lexical portal is diachronically flat. Even our larger networks based on the 2nd or 1st Millennia, and the Neo-Assyrian period, will not reflect changes that happened within these time periods. In all instances, and in order to provide as much data as possible, we have relied on the metadata labels provided by Oracc. If you are interested in a particular set of data, we recommend using the Korp interface (Jauhiainen et al. 2019) or downloading the full dataset on which these graphs are based and creating more specific networks from that data (Jauhiainen et al. 2021).
We used the computational linguistic measurement called Pointwise Mutual Information (PMI). PMI detects words that co-occur frequently in the dataset. For example, in the sentence: “A comfortable chair is important for the whole family,” PMI can calculate co-occurrence probabilities for words that occur close to “chair” (e.g. “comfortable” or “family”).
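As a rough illustration, the textbook form of the measure is PMI(a, b) = log2( p(a, b) / (p(a) p(b)) ), where p(a, b) is the probability of the two words occurring near each other and p(a) and p(b) are the probabilities of each word on its own. The sketch below computes this for word pairs within a small window. It is a simplified example and should not be read as the exact procedure behind the networks: the function name `pmi_scores`, the window size of 2 and the way probabilities are estimated are all assumptions, and the scores shown in the portal may come from a different or normalized variant of PMI.

```python
import math
from collections import Counter


def pmi_scores(tokens, window=2, placeholder="_"):
    """Textbook PMI for pairs of words that occur within `window` tokens
    of each other. Placeholder tokens take up space but are never scored."""
    word_counts = Counter(t for t in tokens if t != placeholder)
    pair_counts = Counter()
    for i, a in enumerate(tokens):
        if a == placeholder:
            continue
        for b in tokens[i + 1 : i + 1 + window]:
            if b != placeholder and b != a:
                pair_counts[tuple(sorted((a, b)))] += 1

    total_words = sum(word_counts.values())
    total_pairs = sum(pair_counts.values())
    scores = {}
    for (a, b), n in pair_counts.items():
        p_ab = n / total_pairs
        p_a = word_counts[a] / total_words
        p_b = word_counts[b] / total_words
        scores[(a, b)] = math.log2(p_ab / (p_a * p_b))
    return scores


# The example sentence above, with non-content words replaced by "_".
tokens = ["_", "comfortable", "chair", "_", "important", "_", "_", "_", "family"]
for pair, score in sorted(pmi_scores(tokens).items()):
    print(pair, round(score, 2))
```

A single sentence is far too little data for meaningful scores. The measure only becomes informative when the counts come from a large corpus.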
The scores calculated by PMI were then imported into Gephi, a network visualization software. The connections, or edges, are therefore visualizations of these scores. Furthermore, the thickness of an edge indicates the PMI score. The thicker the line, the higher the PMI score between two words.
Further details of this approach can be found in the publications in the ‘Annotated Bibliography’ section of the webpage.
The colors in the network represent the communities within the larger network. In short, communities in Network Analysis are nodes that are more connected within a sub-section of the network than outside of the sub-section.
Each community is assigned a random color to make it easier for the human researcher to find these communities.
‘Modularity class’ can be found in the sidebar of the network view. It refers to the ID given to the community group. Communities in Network Analysis are nodes that are more connected within a sub-section of the network than outside of the sub-section. Each modularity class ID is assigned a random color. Sometimes the colors can be very similar, so the modularity class can help you differentiate communities.
‘Degree’ counts the number of connections (or edges) a node has. For example, in the sample network used in the User Guide (“All Akkadian texts with all Akkadian words (in English)”), the word “messenger” (našparu) has a degree of 15.
In our networks, degree is the number of PMI results associated with an Akkadian word. In our example, “messenger” (našparu) has 15 co-occurrences according to PMI.
‘Weighted degree’ is similar to degree, but takes the weight of the connections (or edges) into account. Instead of counting the number of edges, it sums the weight of the edges connecting a particular node. For example, in the sample network used in the User Guide (“All Akkadian texts with all Akkadian words (in English)”), the word “messenger” (našparu) has a degree of 15. However, when only adding together the weights of the edges, the weighted degree is 4.18.
This score can then be compared with other nodes’ weighted degree scores. In our network, weighted degree is the sum of the PMI scores associated with an Akkadian word. Comparing weighted degrees can therefore indicate which Akkadian words have the strongest co-occurrences with other Akkadian words.
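The difference between the two measures can be shown with a short sketch. The function name `degree_stats` is our own, and the edge list is invented for illustration (only the “messenger” and “tired” pair and its score of 0.76 come from the sample network), so the totals do not reproduce the real values of 15 and 4.18.

```python
from collections import defaultdict


def degree_stats(edges):
    """Return (degree, weighted_degree) dictionaries for an undirected graph
    given as a list of (word_a, word_b, weight) tuples."""
    degree = defaultdict(int)
    weighted_degree = defaultdict(float)
    for a, b, weight in edges:
        for node in (a, b):
            degree[node] += 1                # count the edge
            weighted_degree[node] += weight  # add up the edge weight
    return dict(degree), dict(weighted_degree)


# 'word_x' is a placeholder and its weight is made up.
edges = [
    ("messenger", "tired", 0.76),
    ("messenger", "word_x", 0.24),
]

degree, weighted_degree = degree_stats(edges)
print(degree["messenger"], round(weighted_degree["messenger"], 2))
```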
In the sidebar is a list of the words immediately connected to a selected Akkadian word. For each word there is a number in brackets after it. The number is the PMI score for the listed and selected word. For example, in the sample network used in the User Guide (“All Akkadian texts with all Akkadian words (in English)”), the word “messenger” (našparu) has “tired” (anīhu) in its list of linked words. The number in brackets next to “tired” (anīhu) is 0.76, which means the PMI score between “messenger” (našparu) and “tired” (anīhu) is 0.76.
The principles of Network Analysis have long been established, and there are several excellent textbooks available. We have found the following particularly helpful:
Please refer to the ‘Annotated Bibliography’ section of the Lexical Networks page for publications by members of Team 1 using Lexical Networks to answer Assyriological questions.
Some of the Akkadian words have “(Loc)” included in their label. This indicates that the word is the name of a location.
At the top of the sidebar is the word you have clicked. Next to it is a coloured circle. This denotes the cluster (or ‘community’) the word is part of. If you click this coloured circle, it will only show the words connected to the word you are interested in that are part of the same cluster. To return to all of the words connected to the word of interest, click the coloured circle again.
If you then click on the grey background, you will see all the nodes that belong to this cluster. If you click on the coloured circle again, you will remove this filter and return to viewing the whole network.
For the same instructions with images, please look at the section ‘Cluster filtering’.
When you type a word into the Search Bar, the Search Bar searches for that word in all of the labels and translations of the labels. It will then produce a list of all words - displayed in the label and not - that match the search term. You can then choose the term you want, and look only at that word.
In comparison, the Label Filter only searches for the word you have typed in the labels. It will display all nodes whose label includes your search term, whether or not it is a full word. You therefore need to be more specific with this search, and put double quotation marks around full words.
In all the networks, regardless of the display language, you can use the Search Bar to find a specific Akkadian word. In the networks whose words are displayed in Akkadian, you can also use the Label Filter function.
In all the networks, regardless of the display language, you can use the Search Bar to find a specific English word. In the networks whose words are displayed in English, you can also use the Label Filter function.
Once you have found a word of interest, you can go to the website Korp, which allows you to make complex searches of the Oracc dataset. This step will allow you to contextualize the patterns found in the networks.
A user guide for Oracc in Korp is available online. You can find it at the metadata website under the heading ‘Documentation’.
You can get to Oracc and the texts the PMI scores are based on through Korp. Korp allows you to make complex searches of the Oracc dataset. This step will allow you to contextualize the patterns found in the networks.
A user guide for Oracc in Korp is available online. You can find it at the metadata website under the heading ‘Documentation’.
Once you have the search results in Korp in the KWIC format, you select a search result. A panel will appear on the right side of the screen. When you scroll to the bottom of this panel there is an entry called “link to Oracc:”, followed by a hyperlink. The link will take you to the text and line of the Korp search result as displayed in Oracc.
For example, clicking the ‘Search in Korp’ link for “messenger” (našparu) results in a KWIC search list of 14 results. Clicking a word in the first result line brings up the sidebar to the right. Then, the “link to Oracc:” hyperlink takes you to the text the search result came from.
By downloading the network you will be downloading the entire network prior to any filtering you have done in the portal. When you click ‘Download’ you will see a pop-up asking where you want to save the network. It will be saved as a .gexf file, which is easily opened in Gephi.
Note: when you open the network file in Gephi, it may show you some warnings. These only refer to the colours in the network, and so it is safe to continue.
When you open the network in Gephi you will see that it has included the following: weighted degree; degree; modularity class; a search url for the word in Korp; a link to the word’s ego network in the lexical portal; a translation of the word (if this was an English graph, it will be in Akkadian, and vice versa).
The same information can be found in the User Guide section ‘Download’.
Gephi is a visualization software used in generating networks. It is free to use, and has many in-built Network Analysis tools. Furthermore, there are many plug-ins available for more complex usages of Network Analysis.