Within the Horizon 2020 project EUHubs4Data \cite{EUH4D}, data experimentation projects
with small and medium enterprises (SMEs) were
set up to showcase and accelerate data innovation. We ourselves
were funded by the EUHubs4Data project, which comprised 42
``experiments''. Experimentation in this sense
meant that a group of partners created a controlled environment by forming a federation of data-driven innovation hubs
that established a cross-sectoral and cross-border data space.
This space emulated a diverse ecosystem of the kind we might see in the future data spaces currently being established across the European Union, as recently defined by the Data Act \cite{}, which entered into force on 11 January 2024, just as the project was being finalized.
An important aspect of such experimentation is to validate concepts and assumptions, but also to discover unknown and sometimes adverse effects.
Such adverse effects of the innovative use of data often relate to ethical issues. As early as July 2012, the UN Human Rights Council affirmed in its resolution on the promotion and protection of human rights on the Internet ``that the same rights that people have offline must also be protected online, in particular freedom of expression''. Since then, things have developed rapidly, driven in particular by the free flow of data that the Internet enabled. More recently, the use of artificial intelligence, which also played the most prominent role in data innovation in our project, has become a major concern with respect to human rights. The AI Act \cite{AIAct}, which is expected to become European legislation within 2024, even mandates human rights impact assessments for the use of AI models in certain ``high-risk'' applications. Many of the impacts of novel uses of data to derive knowledge or to build autonomous agents are still unknown.
The experiments, which we will reflect on in this whitepaper, were led by innovative SMEs that were
independently selected in so-called ``open calls''. They were supported by a number of predefined ecosystem members (so-called data innovation hubs, or i-Spaces
in our case) that were directly funded to provide an infrastructure for
experimentation. Typically, this setting emulates a market situation
using a public offering. However, being carried out within a research
and innovation project, the situation differs because SMEs can use
public funding to cover both the cost of the offered services and their own
work.
In such a setting, the ethical impact is a big unknown at the start. We therefore set out to monitor the experiments and the overall project \cite{D3.1}. After all, data and AI ethics are supposed to be one of the main distinguishing aspects of the upcoming European economy \cite{some digital strategy}. After three years and having
monitored 42 very different experiments all around data, it is time to review and reflect on what we have learned and on the state of our ecosystem, which we attempt in the first section. One of the findings that we will present is that, after five years of the General Data Protection Regulation, SMEs still have difficulties achieving basic levels of compliance when dealing with large amounts of data. What we have seen throughout the project underlines the importance of practical frameworks and support structures, which will become even more important when implementing trustworthy AI. This is why, in the second part of this whitepaper, we will argue for further experimentation as a form of exercise in actually living by the standards we set ourselves. Based on our own experience, we see the need to share our tools that practically enable ethics,
but also basic legal and contractual compliance, as SMEs are increasingly overwhelmed by the demands imposed on them.
\section{42 experiments under the ethical looking glass}\label{}
After reviewing a first set of ten projects, we published our initial findings in a public report \cite{DXX}, in which we already summarized many of the challenges. In this whitepaper, 32 experiments later, we would like to briefly summarize the areas that appeared critical during our ethics monitoring.
As a methodology, we reviewed all 42 projects at the start, at midterm, and after finalization. We did this as a group of technical experts with extensive practical experience in data innovation, advised by a legal expert. When we started, we had few tools and expected to focus on rather theoretical edge cases of unintended use of data and AI models. However, as we discovered, and this is probably the major finding of our extensive experimentation, the majority of applications touch on real hazards around data. In most of those cases, the biggest problem was that the initial documentation left it unclear whether real ethical risks could be foreseen. Given the dynamics of any funding application, one can easily imagine that applicants emphasize the positive impacts and the foreseen scaling of data processing. In general, the negative impacts and adverse effects foreseen at the start of experimentation were clearly out of proportion to the expected benefits for most projects. We believe that ethics by design and by default requires, first of all, awareness of such hazards, which we tried to provide as feedback to the experiments.