Latest Official News Collector for COVID-19

01 April 2020

Essabrari Hafsa (1), Nahiz Oussama (2)


THE PROBLEM OF FAKE NEWS

Fake news refers to viral posts that intentionally spread misinformation to mislead readers and promote particular ideas. It is as old as the media industry itself, but its volume has grown with the emergence of new technologies. Online social media has also given non-journalists wide opportunities to take part in journalistic activities (e.g., producing articles or news reports), which gave rise to the concept of citizen journalism. The lack of expertise among citizen journalists has contributed to the rapid and wide spread of false information, with many adverse consequences. For instance, fake news persuades users to accept biased information and changes the way individuals react to real news. It can also harm society as a whole by creating social unrest and stress. During a pandemic in particular, the spread of fake news can exacerbate the social damage.


NEWS COLLECTOR: A PARTICULAR WAY TO FIGHT FAKE NEWS

News Collector is an automated search engine that collects news from official sources. At present, official COVID-19 news is provided by several institutions and official media sources, such as the Ministry of Health, the Ministry of Interior, and MAP.

The collected information is published on a dedicated webpage, http://mobadarat.ma/news.php, where the content is updated periodically.

 

The News Collector tool was developed through the following process:

  • Identify news sources: as a start, we chose to work with two sources, the Ministry of Health and the MAP Anti Corona portal.

  • Analyze the structure of every page of the trusted sources by carefully reading the HTML code. The goal is to find a consistent structure from which the news information can be retrieved.

  • Build a scraper for these pages using Node.js, axios, and cheerio.

  • Run the scraper periodically to collect the latest news.

  • Publish the site at http://mobadarat.ma/news.php.

  • Publish the source code in this repository.

  • If you identify another trusted news source, please let us know by email.
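
The scraping steps above can be illustrated with a minimal sketch. It is not the code from this repository: the source URL, the CSS selectors, and the function names (SOURCE, toAbsoluteUrl, mergeNews, extractNews, scrapeSource) are hypothetical placeholders, and the real selectors depend on the HTML structure found during the analysis step. It assumes axios and cheerio have been installed with npm.

```javascript
// Sketch of a scraper for one trusted source. The URL and CSS selectors
// below are hypothetical placeholders, not the real page structure.
const SOURCE = {
  name: 'example-source',          // hypothetical label
  url: 'https://example.org/news', // hypothetical URL
  itemSelector: 'div.news-item',   // hypothetical selector for one news entry
  titleSelector: 'a.title',        // hypothetical selector for its title link
};

// Resolve a possibly relative link against the page it was found on.
function toAbsoluteUrl(href, baseUrl) {
  return new URL(href, baseUrl).href;
}

// Merge freshly scraped items into the stored list, skipping links already seen.
function mergeNews(existing, fetched) {
  const seen = new Set(existing.map((item) => item.link));
  const merged = existing.slice();
  for (const item of fetched) {
    if (!seen.has(item.link)) {
      seen.add(item.link);
      merged.push(item);
    }
  }
  return merged;
}

// Extract title/link pairs from raw HTML with cheerio.
function extractNews(html, source) {
  const cheerio = require('cheerio'); // loaded lazily; needs `npm install cheerio`
  const $ = cheerio.load(html);
  const items = [];
  $(source.itemSelector).each((_, el) => {
    const anchor = $(el).find(source.titleSelector).first();
    const href = anchor.attr('href');
    const title = anchor.text().trim();
    if (href && title) {
      items.push({
        source: source.name,
        title,
        link: toAbsoluteUrl(href, source.url),
      });
    }
  });
  return items;
}

// Download the page with axios and parse it.
async function scrapeSource(source) {
  const axios = require('axios'); // loaded lazily; needs `npm install axios`
  const response = await axios.get(source.url, { timeout: 10000 });
  return extractNews(response.data, source);
}
```

A periodic run could then be as simple as calling scrapeSource(SOURCE) inside setInterval, with an interval of your choosing, and passing the result to mergeNews so that already collected articles are not duplicated.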



ENCOUNTERED DIFFICULTIES

The main difficulties we encountered are:

  • Some pages are protected, so we had to change the User-Agent header to gain access.

  • The news articles do not have a clear timestamp (exact date and time).

  • Some news articles are published in a closed format (e.g., closed PDF), which makes automatic reading of the content almost impossible. This is the case, for example, for the articles published by the Department of Administration Reform.
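
The first two difficulties can be handled along the following lines. This is a sketch under assumptions, not the project's actual code: the User-Agent string is an arbitrary browser-like value, and the names buildRequestConfig, stampItem, fetchHtml, publishedAt, and collectedAt are hypothetical. The idea is to send a custom User-Agent header with each axios request, and to record the collection time for every item so that articles without a usable timestamp can still be ordered.

```javascript
// Hypothetical User-Agent string; any browser-like value could be substituted.
const BROWSER_USER_AGENT =
  'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0 Safari/537.36';

// Build an axios request config that sends a custom User-Agent header.
function buildRequestConfig(userAgent = BROWSER_USER_AGENT) {
  return { headers: { 'User-Agent': userAgent }, timeout: 10000 };
}

// Normalize the publication date when the page provides a parseable one,
// and always attach the time at which the item was collected.
function stampItem(item, now = new Date()) {
  const parsed = item.publishedAt ? new Date(item.publishedAt) : null;
  const hasValidDate = parsed !== null && !Number.isNaN(parsed.getTime());
  return {
    ...item,
    publishedAt: hasValidDate ? parsed.toISOString() : null,
    collectedAt: now.toISOString(),
  };
}

// Fetch a protected page using the custom header.
async function fetchHtml(url) {
  const axios = require('axios'); // loaded lazily; needs `npm install axios`
  const response = await axios.get(url, buildRequestConfig());
  return response.data;
}
```

The collection time is only an upper bound on the publication time, so it should be presented as such rather than as the article's date.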


FUTURE POTENTIAL IMPROVEMENTS

As this work is still in its infancy, we encourage interested volunteers to support this effort.

Urgent actions include: 

  • Add more news sources

  • Provide an API so that other content consumers can reuse this news
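
As a starting point for volunteers, the API idea could look roughly like the sketch below. No such API exists yet: the /api/news path, the response shape, and the names toApiResponse and createNewsServer are all assumptions made for illustration. It uses only the built-in Node.js http module.

```javascript
const http = require('http');

// Serialize collected items as the JSON body of a hypothetical /api/news endpoint.
function toApiResponse(items) {
  return JSON.stringify({ count: items.length, items });
}

// Create (but do not start) a server that exposes the collected items as JSON.
// `getItems` is a caller-supplied function returning the current list of items.
function createNewsServer(getItems) {
  return http.createServer((req, res) => {
    if (req.method === 'GET' && req.url === '/api/news') {
      res.writeHead(200, { 'Content-Type': 'application/json; charset=utf-8' });
      res.end(toApiResponse(getItems()));
    } else {
      res.writeHead(404, { 'Content-Type': 'application/json; charset=utf-8' });
      res.end(JSON.stringify({ error: 'not found' }));
    }
  });
}
```

Calling createNewsServer(() => items).listen(3000) would start it on port 3000, the port being an arbitrary choice.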


Do you want to help us? Contact us.


(1) Student at 1337, essabrarihafsa@gmail.com

(2) Freelance developer and student at 1337, Useit015@gmail.com