Redesigning The Hindu Data Point Stories (2020) #4
Comments
-----Work in progress-----
Original Article by The Hindu
Facts
Claims
Original Visualization:
Problems with the visualization:
Ideas:
Re-designed visualization:
|
Good effort Abhijit. |
-WIP- How many GIs does your state have?
The article is educational, intended to communicate all the Geographical Indications (GIs) in the country. The visualization in the article tries to normalize the number of GIs in each state against the state's area. It explains the concept of a GI: a sign used on products that have a specific geographical origin and possess qualities or a reputation that are due to that origin. These products are split into 5 categories -
Original Visualization by The Hindu Issues with existing visualization -
Visualization 2 & 3
Scope for redesign
Attempt 1: I started by plotting all GIs on a map of India as a symbol chart to see whether any natural clusters emerged. I also added color and the number of GIs as dimensions to help users identify clusters. However, after collecting the data, the GIs seemed to be spread fairly evenly across the country, with the exception of some states. The context of the types of GIs was also lost.
Attempt 2 To Do -
|
Where India's mobile Internet speed ranks globally, which operator offers the fastest download speeds, and more
Click here to access the article.
Class Discussion:
Redesign: A full list of countries with their internet speeds might overload viewers with extra information. A color-coded world map that labels only the important countries could be more efficient, since it gives the viewer a basic understanding of internet speeds across the world.
The author emphasizes the 11 countries mentioned in the table (the top 3, the BRICS nations and the last 3). Representing them on a bar graph arranged by rank would make it easier for the viewer to compare internet speeds between countries just by looking at the heights of the bars.
I thought that the percentage of a country's population subscribed to mobile internet could have an impact on the internet speed in that country. The relationship between internet speed and the number of mobile internet subscribers could have made an interesting visualization with more insights, but I couldn't find any significant relation between these two attributes.
As we measure internet speed in bandwidth, I wanted the visualization to look like a band. That is why I chose stream graphs to visualize how the number of internet users changed over time. Since the count started from 0 and increased gradually, the visualization may not look as much like a band as I expected.
The bar graph represents the internet speeds of the 4 major operators in India along with their global positions. In the article, the performance of the operators in the various states of India was presented in a tabular format. Small multiples could show how the operators are performing across India in a better way. However, populating the values on the map might overload the viewer, which made me consider a Sunburst graph to represent the quantitative data.
In the Sunburst graph, I have color coded the 4 operators; each state takes the color of the operator that provides the fastest internet there. The speed of the internet is also color coded, with the darkest color being the fastest and the lightest the slowest. |
Initial study: The article. The article that I picked was Hunting in pairs: a look at the best bowling partnerships in Test cricket. It was written on the occasion of English pace bowler Stuart Broad becoming the seventh bowler and second Englishman, after pacer James Anderson, to pick up 500 Test wickets. The article discusses the best performers in 3 areas.
1. Bowler pairs with 500 wickets or more. According to the article:
2. Best combined strike rate (for minimum 200 wickets).
3. Bowling pairs with most wickets / game (for minimum 200 wickets).
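The two derived metrics in the list above can be sketched in a few lines. This is a minimal illustration that assumes the usual cricket definition of bowling strike rate (balls bowled per wicket) and assumes that a pair's figures are pooled over the matches they played together; the article's exact method is not stated here. The `PairStats` class and the sample numbers are hypothetical, not data from the article.

```python
from dataclasses import dataclass


@dataclass
class PairStats:
    """Pooled figures for one bowling pair (hypothetical structure)."""
    name: str
    balls_bowled: int       # balls bowled by both bowlers combined
    wickets: int            # wickets taken by both bowlers combined
    matches_together: int   # Tests in which both bowlers played


def combined_strike_rate(pair: PairStats) -> float:
    """Balls bowled per wicket for the pair; lower is better."""
    return pair.balls_bowled / pair.wickets


def wickets_per_match(pair: PairStats) -> float:
    """Average wickets the pair took per match played together."""
    return pair.wickets / pair.matches_together


def qualifies(pair: PairStats, min_wickets: int = 200) -> bool:
    """Apply the minimum-wickets cutoff used in categories 2 and 3."""
    return pair.wickets >= min_wickets


# Made-up numbers, chosen only so the arithmetic is easy to follow.
example = PairStats("Pair A", balls_bowled=12000, wickets=240, matches_together=30)

if qualifies(example):
    print(example.name, combined_strike_rate(example), wickets_per_match(example))
```

Keeping the cutoff as a parameter makes it easy to reuse the same check for the 500-wicket category.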
Solution: An alternative lens. From the data that I had, and the calculated fields, it was possible to extract the following columns / attributes.
Iteration 1. In my first iteration, I decided to pick the bowling partnerships that were mentioned in the article. For me, they represented the "best performers" in various categories. I also included the best-performing partnership of Muttiah Muralitharan, because he has been the highest wicket-taker in Test matches. I decided to keep them all in one chart and see where they lie with their stats. I started by placing their information in a table. Next, I highlighted the boxes that displayed the main reason for each pair being in the table (the categories of their best performance).
Iteration 2. In my second iteration, I took the players mentioned above and tried visualising them on the basis of the wickets taken and their combined strike rate.
Iteration 3. In the third iteration, I added the player's country to the labels and included the wickets taken per game as another attribute.
Iteration 4. From the previous iterations, I understood that it is certainly difficult to represent data for sportspersons, in this case cricketers. The graphic that I created does display the information, but does it allow the reader to understand it clearly? From the feedback given to me, I realised that instead of showing information about all the pairs highlighted in the article, I could take only the data for the pairs that excelled in one particular way, for example, the pairs that took 500 wickets or more. Also, instead of visualising the information as a graphic, I could highlight a few things in the table itself (as I had attempted in iteration 1). The data I had for the pairs with 500 wickets or more: The table after selecting important attributes and rounding off values: I highlighted the highest and lowest values in the table.
Iteration 5. After a few more adjustments, I looked for insights from the table and came up with this infographic. |
-WIP- The Story The article takes a look at the data between 2001 and 2018, on the custodial deaths and the number of policemen convicted in those cases. While calls for a fair probe are growing, differences in these numbers are alarming. Most of these deaths were attributed to reasons other than custodial torture, such as suicide and death in hospitals during treatment. The article puts its focus on five states including Tamil Nadu, where the father and son died while in custody. The data shows that there were no police convictions between 2001 and 2018 in these states. The following are the visualizations that came with the article. The idea of creating an impact by showing the alarming difference between the numbers was somehow lost in these visuals. The harmless circular form could have been replaced with a sharp angular graph. Data
Approach Rough sketches Another idea was to show the state-wise number of custodial deaths and the number of policemen convicted over the years, on a scatter plot. The size of the circle represents the numbers. I made an attempt to create a visualization based on the second concept, with the available data. |
What Percentage of people prefer to speak Hindi across States?
The Narrative
The Data used
*all Hindi speakers - the assumption in the census is that 'All Hindi speakers' can be calculated by summing up the people who use Hindi as their 1st, 2nd and 3rd language. 4th language and beyond are not included. The data for the first three points can be extracted from here Comments on the data used Comments on the Dataset
Visualizations 1. Map titled 'Statewise Split' The map below is a Choropleth that shows the percentage of the population in each state that speaks Hindi. Encoding problem 1: However, there were 2 issues with reducing the bins:
I tried 4 and 5 bins respectively, but each led to similarly misleading groupings, so I decided that 10 bins were best. Encoding problem 2: However, this did not really look like the shape of India, and the class feedback was that it did not make sense to even show this information geographically.
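The effect of bin count on the choropleth, discussed under Encoding problem 1, can be illustrated with a small sketch. It assumes equal-width bins over 0 to 100 percent, which may not match the binning actually used in Tableau; the function `bin_index` and the two state percentages are hypothetical and chosen only to show how coarse bins can lump together states that finer bins separate.

```python
def bin_index(pct: float, n_bins: int) -> int:
    """Return the 0-based equal-width bin for a percentage in [0, 100]."""
    if not 0 <= pct <= 100:
        raise ValueError("pct must be between 0 and 100")
    width = 100 / n_bins
    # Exactly 100% would land one bin past the end, so clamp to the last bin.
    return min(int(pct // width), n_bins - 1)


# Hypothetical shares of Hindi speakers, not values from the census.
shares = {"State A": 26.0, "State B": 39.0}

for n_bins in (4, 5, 10):
    bins = {state: bin_index(pct, n_bins) for state, pct in shares.items()}
    same_color = len(set(bins.values())) == 1
    print(f"{n_bins} bins: {bins} -> same color: {same_color}")
```

With these sample values, the two states share a color at 4 and 5 bins but are distinguished at 10 bins, which mirrors the trade-off described above.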
2. Scatter Plot titled 'Native vs Non-native speakers' Encoding: Encoding Problem 1 Encoding Problem 2
3. Scatter plot titled 'An alternative means of communication' Encoding is the same as the previous scatter plot. Problem 1 Problem 2
Tools used: Tableau, Figma and Tilegrams - a good open-source tool for tiled maps. For details on how to make a tilegrams map compatible with Tableau, read this. FINAL OUTPUT For the redesign, I used the same title and mostly the same text from the Hindu Article. I may have edited the text slightly. |
Gender disparity in early education
The Hindu article chosen is available here. What is the story the author is trying to tell?
Students in private schools performed better in various tasks than those enrolled in government schools and anganwadis, according to the Annual Status of Education Report (Rural) 2019 - This should be backed by the statement in the ASER referring to the method of instruction and not the data of the final results (as by that method of reasoning one could also claim the results are biased because of the gender ratio and the differing capabilities of the genders)
The grouping of age groups and the use of disorganized bar charts to represent parts of a whole. Initial ideation to combine these effectively: possibility of using a Spider chart to visualise better -
Explored Spider charts using different scales and parameters - decided to go ahead with a scale from 70 to 100 percent for the completion of a particular activity. Chose six activities, two each from cognitive, language and numerical skills.
Final Visualization
A Spider chart for the difference in performance, i.e. boys performing better than girls; a pie chart for the reason for that difference, i.e. the differing percentages of girls and boys in private and govt schools; and a line graph for the reason for the differing admission to private and govt schools, i.e. the education of the mother.
Comments - Could create 3 spider charts to show in greater detail the performance in all 3 areas of cognitive, language and numerical skills. The data is available in ASER. |
How have climbers fared in the test of Mount Everest?
The original article can be found here
What story is the article trying to tell?
The next part also shows a visualization of the main causes of death using a tree diagram (figure 2) but does not give any valuable information. The article also shows the data in terms of countries (figure 3) and between genders (figure 4). In figure 3 (Source: Hindu) the rate of success is plotted along with the number of failures and successes. Nepal is obviously leading because of its proximity to Everest. Russia, with very few attempts, has a very high success rate. This chart was one of the better ones of all of them.
--REDESIGN--
First Attempt: I made another visualization of the causes of death on the mountain. The y-axis is the height at which the death occurred, the x-axis is time, and the size of the shapes shows whether the climbers summited or not. Similar articles have mentioned that people have been more successful through the years, but the death rate has not changed much, hovering under 1%. We now have better equipment to climb the mountain and to predict the weather. So why is the death rate not decreasing further? As I analyzed the causes of death, I decided to bin a few categories and divided them into Internal and External: "internal" being caused by sickness and illness, and "external" caused by harsh weather conditions, falls, and avalanches. There were some categories, like unknown or disappearance, for which I used pink and black respectively. The visualization shows that more recently people have been dying due to internal factors rather than external factors. This could be because climbing Mt. Everest has become more of a tourist attraction, where anyone can pay to climb the mountain without much proper training. The big circles show the people killed by an avalanche in 2014.
Feedback: I had to do the binning within the legend as well, and there was some difficulty in recognizing whether shapes overlapped. One suggestion was to use two different charts for summit and non-summit. I tried using separate charts for those who summited and those who did not, but the focus of the story changed; I wanted to highlight the type of death on the mountain. I used 4 bins, 2 of which are similar: falls (darker blue) could be caused by faulty technique and equipment, whereas avalanches (lighter blue) were unpredictable. I combined disappearance with unknown and other as black.
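The Internal/External grouping from the first attempt can be sketched as a simple lookup table. The raw cause labels, the sample records, and the choice to send unlisted causes to "unknown" are all assumptions made for illustration; they are not taken from the dataset or from the final 4-bin legend.

```python
from collections import Counter

# Hypothetical raw cause labels mapped to the bins described in the first attempt.
CAUSE_TO_BIN = {
    "illness": "internal",
    "sickness": "internal",
    "harsh weather": "external",
    "fall": "external",
    "avalanche": "external",
    "unknown": "unknown",
    "disappearance": "disappearance",
}


def bin_cause(cause: str) -> str:
    """Return the bin for a raw cause label.

    Labels are normalised (trimmed, lower-cased) before lookup.
    Unlisted causes fall back to 'unknown', which is an assumption.
    """
    return CAUSE_TO_BIN.get(cause.strip().lower(), "unknown")


def count_bins(causes):
    """Tally how many records fall into each bin."""
    return Counter(bin_cause(c) for c in causes)


# Made-up records, only to show the tally.
sample = ["Fall", "Avalanche", "Illness", "Disappearance", "avalanche "]
print(count_bins(sample))
```

Keeping the mapping in one table means the same bins can drive both the chart colors and the legend, which addresses the feedback about binning within the legend.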
|
Where does your state stand on the India Innovation Index?
Original Article by Varun Krishnan, here.
Introduction to the topic: The India Innovation Index examines the innovation capabilities and performance of the Indian states and UTs. It is measured as an average of Enablers (innovation inputs) and Performance (innovation outputs).
About the Article: This article answers two questions:
Story of the article: The writer first presents the bigger picture. He discusses the Global Innovation Index and comparisons with other emerging nations. A more focused comparison is then made among the states of India based on enablers (innovation inputs) and performance (innovation outputs), providing grounds for the state ranks in the India Innovation Index.
Observed structure of the content:
As per the article, points 1 and 2 above form the secondary information, and points 3 and 4 together form the primary information.
Observed objectives of the Data visualisation in the article:
Classification based on Intent:
Data: The source for the Global Innovation Index is here and the source for the India Innovation Index is here. The required data were extracted and cleaned for use in this project.
Types of data -
Identified Problems: The data visualizations of the article are shown in Figures 1, 2, and 3. The identified problems are listed below.
Redesign:
Ideations:
Final Visualisations |
70 Years of Pending Cases in INDIA
Original article: 77 cases filed in the 1950s still pending in courts across India. Link
Story of article: The article also highlights how significantly pending cases have increased since 2010. "Out of the nearly 3 crore cases pending, 2.6 crore were filed after 2010", the article mentioned. Lastly, it points to the pending cases in Uttar Pradesh, which seem to be significantly higher than in other states. "Nearly one in every four pending cases across the country are from Uttar Pradesh (73.1 lakh)", the article mentioned.
The only data available in visualized form was the following table.
Focus of my visualization:
10 states with the largest number of pending cases
Traveling back to the pending cases - View link for Interactive prototype
Highlighting the oldest pending cases till date: the size of the bubble here represents the total number of pending cases in that particular state till date.
Increase in pending cases since 2010
Tools used: Flourish and Tableau Desktop |
What is the share of death sentences among sexual offence cases?
Link to the article here.
The data is presented in the Hindu article in tabular form and as percentages, with colour (saturation) used to indicate the value of the percentages.
Initial redesign idea, as a stacked bar graph |
How Has the State of Democracy in India Changed Since 2008?
Original Story: Data | How has the state of democracy in India changed since 2008?
The Narrative of the Data Story
The article attempts to illustrate the change in India's democracy since 2008, highlighting India's decline on the index, especially in parameters such as Civil Liberties. The data story is told exclusively through three tables and accompanying text, using color hue and saturation in the tables to encode improvements or declines and to mark countries doing better or worse than India. For example, the fewer the countries doing worse than India, the deeper the saturation of red, implying that the result does not bode well for us.
Table 1: India's Scores In 2019
Table 2: Change in Scores since 2014
Table 3: Comparison of Changes in 2008-2014 and 2014-2019
Comments
Current Data Visualization and Story:
Comments on the Dataset
Initial Sketches
My initial ideas addressed the lack of representation of the ordinal data of the countries' ranks and the absence of anchor countries on the spectrum, which would give readers an idea of what the Democracy Index scores mean relative to good democracies and authoritarian regimes. The second visualization, for example, would allow people to compare India's progress over the years against the US, giving them an idea of how well or poorly we have performed against a democracy similar to ours. I did not go with the third visualization, of rank changes, since such a large visualization of changes in the ranks of so many countries would draw too much attention to itself and take away from the primary focus on India's performance through the years. Instead, I chose to focus on the first two ideas and to include India's trends along different parameters over the years (not just a simple trend line of its overall Index score).
Data Compilation
I first compiled the 2019 ranks of India and countries such as Norway, Germany, China, etc. I then scoured the available indices through the years for the scores of India and the US across the different parameters, for the second part of the visualization.
Reasons for comparison against the US:
Performance of India and USA through the Years
Categorization Data for the Visualization
I then generated graphs of India's and the USA's performance through the years with Datawrapper. These were exported as PNG and used as an underlay to trace over in Illustrator, which provided more control over highlights, annotations and font sizing for better readability in the final design of the graphs. One such example of a simple graph from Datawrapper is below.
First Iteration
In the first iteration, I represented India on a continuum of dots that represented the ordinal rank data of the countries, and arranged the graphs of India's and the US's trends vertically below that. The problem with this visualization was that using circular dots to represent all 167 countries made legibility an issue, and the vertical arrangement of graphs made the second component of the visualization difficult to read.
Final Visualization
The final visualization fixes the issues pertaining to the exclusion of other countries: including them gives readers anchors for the scoring out of 10 and contextualizes the ranks within the different categories of regimes. Another major change was the inclusion of annotations, especially for changes of power between the INC and NDA. The visualization is divided into two components:
Design Changes in First Component (Rank Data)
Design Changes in Second Component (Trend Comparison)
Final Thoughts
Making the visualization was insightful, though I have tried to be a little more direct about the INC/NDA split. The split can also be gleaned from the visualization, which shows that we have done significantly worse since 2014. While the original author may have chosen to remain apolitical by not highlighting this contrast, I have chosen not to. Lastly, this visualization could be made even better as an interactive one, in which individual points in the trend graphs have popups of key events instead of the annotations they currently have. Similarly, users could hover over the ranks and see which countries lie at the selected rank. A higher resolution version of the visualization may be accessed here. |
For this assignment, we'll use data stories from The Hindu Data Point.
Select a story that you like, study it carefully and redesign it. Specifically I want you to focus on understanding the data that powers the story, and how it is visually encoded to tell the intended story. Document your design process, capturing the following:
What is the story the author is trying to tell?
What data is he/she using to tell the story? Describe its details -- type of data, extent of the data, dimensions of the data, gaps in the data, what data is essential and what is irrelevant.
How is it encoded, problems with it and how you attempted to improve it.
You may choose to expand or curtail the scope of the data used in the story, or add an additional dataset to tell the story better. But do not deviate from the main intent of the original story. In other words, it is a redesign exercise, and hence I do not want you to tell a different, unrelated story.
While you should provide a link to the original story, it might be useful to capture and display inline, appropriate parts of the original visualization, and your own design iterations to produce a coherent documentation.
For reference, take a look at what the previous batch did with this assignment.