Ethical Issues in Data Mining

A central ethical issue in data mining is consent: people who are not aware that information about them is being collected, or of how it will be used, have no opportunity to give or withhold consent to its collection and use.

Web Data Mining

  • Ref: Ethical issues in web data mining
  • Definition: Web mining refers to the full range of data mining and related techniques used to automatically discover and extract information from web documents and services. When used in a business context and applied to personal data, it helps companies build detailed customer profiles and gain marketing intelligence. Web mining does, however, pose a threat to important ethical values such as privacy and individuality: it makes it difficult for individuals to autonomously control the unveiling and dissemination of data about their private lives.
  • Types of web mining
    • Content mining, e.g., tweets, images, videos, ...
    • Structure mining
    • Usage mining
  • Web content and structure mining is a cause for concern when data published on the web in a certain context is mined and combined with other data for use in a totally different context.
  • Web usage mining raises privacy concerns when web users are traced, and their actions are analysed without their knowledge.
  • Both forms of web mining (content and structure mining on the one hand, usage mining on the other) are often used to create customer files, with a strong tendency to judge and treat people on the basis of group characteristics rather than their own individual characteristics and merits (referred to as de-individualisation).

Social Media Data

Genomic Data Mining

  • Ref: Rethinking the ethical principles of genomic medicine services
  • "We argue that public genomic datasets carry substantial societal benefits, and that the collective nature of these initiatives means that those patients who benefit from genome sequencing have an ethical obligation to share their health information, an obligation grounded in considerations of fairness. We argue that in order to maximise the benefits of genomic services, the storage and use of genomic data for the advancement of medical knowledge should be permitted without explicit and specific consent, and that international and other bodies should be granted access to these data, provided certain conditions are satisfied (Box 1). While the considerations here are largely specific to the NHS, they may form an exemplar for further international work."
    Proposed conditions for the routine collection, storage and use of data:
    • Genomic dataset should be appropriately secure and consist of de-identified data, with genomic and phenotypic data not linked to personal information
    • ...
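
As a rough illustration of the de-identification condition above, the following minimal Python sketch strips direct identifiers from patient records and assigns each record a random study ID. The field names (`name`, `nhs_number`, `date_of_birth`, `address`, `variant`, `phenotype`) and the `deidentify` helper are hypothetical and chosen only for this example; they do not come from the cited paper. Dropping direct identifiers is only one part of de-identification and does not by itself rule out re-identification.

```python
import secrets

# Assumption: these hypothetical field names are the direct identifiers.
DIRECT_IDENTIFIERS = {"name", "nhs_number", "date_of_birth", "address"}


def deidentify(records):
    """Return copies of the records with direct identifiers removed.

    Each copy receives a random study ID that is not derived from any
    personal field, and no mapping back to the person is retained.
    """
    cleaned = []
    for record in records:
        # Keep only genomic/phenotypic fields, not personal information.
        kept = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}
        kept["study_id"] = secrets.token_hex(8)  # 16 random hex characters
        cleaned.append(kept)
    return cleaned


# Made-up example record, for illustration only.
patients = [
    {
        "name": "Jane Doe",
        "nhs_number": "0000000000",
        "date_of_birth": "1980-01-01",
        "address": "1 Example Street",
        "variant": "example-variant",
        "phenotype": "example-phenotype",
    }
]

released = deidentify(patients)
```

The study ID is generated randomly rather than hashed from an identifier, so it cannot be recomputed from personal data; the trade-off is that records can no longer be linked back to the patient.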