
Toward Effective AI Support for Developers

A survey of desires and concerns

Mansi Khemka and Brian Houck

Years of software engineering and product development have taught us that the best way to build products that delight customers is to talk to customers. Talking to actual customers provides important insights into their challenges and into what they love. This leads to innovative and creative ways of solving problems (without creating new ones) and guards against ruining workflows the customers are already delighted with.

And yet... the emergence of AI has many leaders forgetting these lessons in a rush to create new AI-driven development tools, often without consulting actual developers. Our research is meant to help close that gap and give companies, product teams, and fellow practitioners insights into the opportunities and concerns that developers have with using AI in their work. Armed with this information, product teams and leaders can make better product decisions and communicate more effectively about the changes happening around them.

Much of the existing literature focuses on the impact and efficacy of AI-driven development tools—such as GitHub Copilot, powered by OpenAI's Codex3—from a performance-centric perspective, such as the relevance of the code generated by GitHub Copilot11,12 or the perceived increase in developer productivity.9 While some research explores how developers engage with such tools,1,8 the scope is limited. Our approach seeks to invert the lens and prioritize the voices of the developers.

While these tools and studies have merit, there is a need to understand what the developers want instead of what we think they want. The workflow of developers is multifaceted. Their responsibilities range from application development—planning, building, testing, etc.—to tasks such as managing communications with team members and searching for career development opportunities. Thus, we conducted a survey that focuses on directly interacting with the developers, from new hires to seasoned professionals. The survey questions give insight into developers' perspectives on how they view AI, how they want to use it, and what their top concerns are in adopting it.

While some of the survey results adhere to current speculations, some show a deviation from expectations. But acknowledging them and accounting for them in R&D will make AI adoption guided instead of speculative. This approach also provides insight into why some teams may be struggling to drive adoption of AI tools within their organizations.

The first section of this article outlines the details of the survey. The second section discusses some of the areas of AI that excite developers the most. The third talks about their concerns. The article ends with a discussion about what organizations and leaders can do to address these concerns.

 

Methodology

We conducted our survey from April 4–14, 2023, with the aim of gathering the perspectives of software developers in the realm of AI. The survey was designed to answer two questions:

• What aspects of their job would developers be most excited about AI helping with?

• What worries developers the most about integrating AI into their workflows?

Following is the methodology used to conduct and analyze the survey:

1. Survey platform. The survey was conducted using Microsoft Forms, an online platform that facilitates the creation of shareable forms that are suitable for capturing respondent feedback.

2. Sample selection. From a pool of 3,000 randomly chosen invitees, a total of 791 responses were garnered (a 26 percent response rate). While the selection process was aimed primarily at software developers, a marginal number of software development leads and program managers were inadvertently included as a result of the targeting algorithm. Those 54 responses were excluded from the results.

3. Demographics. All respondents were employees of Microsoft, specifically the Cloud + AI division. To ensure unbiased results, members from the Developer Division team, which actively contributes to AI-enabled development tools, were excluded. Furthermore, the survey focused solely on U.S. employees, explicitly excluding partner-level and above developers.

4. Survey structure. The survey consisted of nine questions:

• Seven core questions related to AI, plus two supplementary questions inquiring whether participants wished to receive the survey results and wanted to enter a sweepstakes.

• Five of the core questions followed a selection-list format, while the remaining two were open-text questions, giving respondents the freedom to articulate their views.

• The average response time was 10 minutes.

• Selection-list questions were displayed in randomized order for each participant, who could make between one and three selections. This forced participants to prioritize their responses and prevented them from simply picking most or all of the answers.

5. Incentives. To encourage participation, respondents were presented with an opportunity to win one of 50 gift cards, each worth $50. The winners of these gift cards were selected through a random sweepstakes drawing.

Table 1 lists the survey items, excluding the questions about eligibility or entering the optional sweepstakes.

[Table 1]

 

What aspects of their work would developers be most excited about AI helping them with?

This question offered insight into how to prioritize AI product features to meet developers' needs. Respondents chose from 17 distinct options (plus an "Other" option). Most developers (96 percent) chose a core development activity as at least one of their top three. Options meant to reduce bureaucratic toil (for example, parsing email or tracking and managing work items) were chosen by 37 percent of developers, and 25 percent chose a well-being activity (for example, helping to relax and/or reduce stress) as one of their top three. It should be noted that newly hired developers (those in a role for less than six months) were more likely to choose a well-being area as one of their top three (by 13 percentage points).

The next section discusses the top findings in detail. For each finding, we have cherry-picked open-text responses that were representative of the key themes heard from developers. Figure 1 shows the options provided for the first of the two questions and the percentage of respondents who selected each one.

[Figure 1]

 

Notice the sum is greater than 100 percent. This was expected because a respondent could select from one to three options.

 

Generating Unit, Integration, and Functional Tests

Selected by 44 percent of respondents

Software testing is a critical part of software development that ensures the reliability and performance of applications and can help prevent costly defects in production. Testing can validate that the product meets its requirements and provide stakeholders with confidence in the quality of the product. While the value of testing may be high, it is often a challenging activity that many developers may find less exciting than core feature work. Moreover, what was once a separate role altogether (and still is in some organizations) has now been folded into the core development tasks of a software developer. Essentially, developers are now doing what was once two jobs.

"Unit testing is a monotonous process. It would be great if AI can auto generate these cases."

It is unsurprising, then, that the top task that developers responded they are excited to delegate to AI-powered tools was writing tests. This would not just alleviate the "monotony" but could also result in higher-quality tests and, by extension, a higher-quality product. Anything that AI can do to ease the burdens of testing can improve both DevEx (developer experience) and customer outcomes.

"The time it takes to develop a feature is about the same as the time it takes to test a newly developed feature, and sometimes even more time is spent on testing. We need to come up with more comprehensive test cases, but occasionally we still miss some corner cases. Moreover, most of the testing code is regular and predictable, so I think AI can help complete the testing code after the feature code is finished."
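To make the ask concrete, here is a sketch of the kind of test scaffolding respondents hope AI can draft. The `parse_price` function and its tests are invented for illustration, not output from any real tool; the point is the shape of the work being delegated, including the "corner cases" developers say they occasionally miss.

```python
# Hypothetical function under test (invented for this example).
def parse_price(text: str) -> float:
    """Parse a price string such as '$1,234.50' into a float."""
    cleaned = text.strip().lstrip("$").replace(",", "")
    return float(cleaned)

# Tests an AI assistant might draft from the function's signature and body,
# covering the regular case plus easy-to-miss corner cases.
def test_parse_price_basic():
    assert parse_price("$10.00") == 10.0

def test_parse_price_thousands_separator():
    assert parse_price("$1,234.50") == 1234.5

def test_parse_price_whitespace():
    assert parse_price("  $5  ") == 5.0
```

As the respondent above notes, most testing code of this sort is "regular and predictable," which is precisely what makes it a plausible target for generation after the feature code is finished.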

 

Analyzing code for defects, vulnerabilities, or optimizations

Selected by 42 percent of respondents

Writing unit, integration, and functional tests primarily focuses on validating the functional correctness and expected behavior of software. Analyzing code for defects, vulnerabilities, or optimizations, however, examines code in aspects of security and performance characteristics. Both activities are not only complementary but also repetitive, making it unsurprising to see these two areas mentioned by a similar number of respondents.

Because of vast historical context, AI tools are well positioned to help in the detection and mitigation of code vulnerabilities, and thus might give developers more confidence in their capabilities. Identifying runtime errors such as null pointer exceptions or security vulnerabilities such as buffer overflows are all patterns that AI theoretically could identify and "shift-left" into the code-writing portion of the development workflow.

"I think AI is good at doing closed-loop tasks and surfacing common insights from data it has seen. For example, code vulnerabilities and optimizations have been written and discussed online for decades, and they are plentiful in every codebase. I trust AI recommendations there, although I would thoroughly review any suggested changes."

Pair programming has been a way for developers to get assistance in finding optimizations and reducing defects in their code. The age of hybrid work has made such activities more challenging, however, and some developers acknowledged that AI can help fill that gap.

"While writing the code, if someone [AI] tells me of potential issues with the code, that would be like pair programming with someone."

Developers can also envision using AI to review and improve code before sharing it with peers for human review.

"It would be absolutely awesome to have AI check code prior to asking peers for code reviews."
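A minimal sketch of the "shift-left," pattern-based checking described above, run while code is still being written. The patterns and the `flag_risky_lines` helper are invented for illustration; a real AI-assisted analyzer would go far beyond regular expressions, but the workflow (flag risky lines before peer review) is the same.

```python
import re

# Illustrative risky-code patterns (assumptions, not a real rule set).
RISKY_PATTERNS = {
    r"\bstrcpy\s*\(": "unbounded copy; possible buffer overflow",
    r"\bgets\s*\(": "reads input without a length limit",
    r"\.get\(\w+\)\.": "possible null/None dereference on a lookup result",
}

def flag_risky_lines(source: str) -> list[tuple[int, str]]:
    """Return (line_number, warning) pairs for lines matching a risky pattern."""
    findings = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for pattern, warning in RISKY_PATTERNS.items():
            if re.search(pattern, line):
                findings.append((lineno, warning))
    return findings
```

Running such a check in the editor, before a pull request is opened, is one way to approximate the "AI check code prior to asking peers for code reviews" experience developers describe.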

 

Writing documentation

Selected by 37 percent of respondents

Typically, 60–70 percent of the SDLC (software development life cycle) is spent on maintaining the software.5 When a developer revisits a code snippet, documentation plays a key role in understanding the code, its design decisions, and its usage. As important as it is, writing documentation is a cumbersome process and often ignored altogether. In addition, with multiple developers working on a codebase, the code changes quickly, and the documentation must be updated at a similar pace. This creates tension between developers' need for documentation and their dislike of creating it.

"As a developer, we hate having no document to help us understand the code better; at the same time, as developers ourselves, we hate to write documentation. If AI tools can help solve this mystery, that will help anyone :)"

Note that automated documentation of code requires context awareness, which makes developers skeptical of completely relying on AI without human proofreading. This point might help in standardizing guidelines for whether documentation should be generated when code is written or when it is consumed/updated.

"If AI is able to create documentation from previously written code, that would make onboarding onto unfamiliar code a lot easier. I see AI doing the 'busy work' with documentation writing, but software engineers still need to read through the generated documentation and make edits to ensure accuracy."
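The division of labor that respondent describes (AI drafts, humans edit for accuracy) can be sketched even without a language model: much of documentation's "busy work" is mechanical transcription of structure. The `draft_docstring` helper and the `move_file` example below are hypothetical, showing only the stub a tool might hand to an engineer for review.

```python
import inspect

def draft_docstring(func) -> str:
    """Draft a documentation stub from a function's signature.

    A toy stand-in for AI-drafted documentation: it fills in the
    mechanical parts and leaves TODOs for a human to complete.
    """
    sig = inspect.signature(func)
    lines = [f"{func.__name__}{sig}", "", "Parameters:"]
    for name, param in sig.parameters.items():
        annotation = (param.annotation.__name__
                      if param.annotation is not inspect.Parameter.empty
                      else "TODO")
        lines.append(f"    {name} ({annotation}): TODO describe")
    return "\n".join(lines)

# Invented function whose documentation we want drafted.
def move_file(src: str, dest: str, overwrite: bool = False) -> None:
    ...
```

Calling `draft_docstring(move_file)` yields a signature line plus one TODO entry per parameter: a starting point for onboarding documentation, not a finished artifact.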

 

Performing root-cause analysis for bugs and incidents

Selected by 31 percent of respondents

RCA (root-cause analysis) is integral to improving software quality because it embodies the principle of learning from errors to prevent future occurrences. The integration of AI into RCAs can reduce the burden on developers by getting them started on their way to uncovering the root issues. It can automate data analysis, pattern recognition, and fault localization, enabling developers to concentrate on strategic problem-solving.

"I would like tools to auto-analyze and identify the root cause of [service incidents] or at least get me close. Run queries, find where the faults happened, add that to the ticket."

Incorporating AI into RCAs aligns with the modern developer's dual mandate: to create and maintain robust software. While AI's role in RCAs is an exciting prospect, it should be implemented with the understanding that it supplements, rather than replaces, the nuanced judgment of experienced developers.
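The "run queries, find where the faults happened" request above amounts to automated fault localization over logs. A toy sketch, with an invented log format: cluster error lines by exception type and source location, and surface the most frequent pair as a starting point for the human investigation.

```python
import re
from collections import Counter

# Invented log-line format for illustration, e.g.:
#   ERROR KeyError at svc/api.py:42
LOG_LINE = re.compile(r"ERROR (?P<exc>\w+) at (?P<loc>[\w./]+:\d+)")

def suggest_root_cause(log_text: str) -> "tuple[str, str] | None":
    """Return the (exception, location) pair that appears most often.

    A stand-in for the automated data analysis and fault localization
    developers want AI to perform before they dig in themselves.
    """
    hits = Counter(LOG_LINE.findall(log_text))
    if not hits:
        return None
    (exc, loc), _count = hits.most_common(1)[0]
    return exc, loc
```

A real incident-analysis assistant would correlate far richer signals (traces, deploys, config changes), but the output contract is the same: a ranked hypothesis attached to the ticket, not a verdict.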

 

Writing new code and/or refactoring existing code

Selected by 25 percent of respondents

Writing new code often equates to a sense of accomplishment for developers, making them feel both productive2 and satisfied with their day's contributions.4,6 This act of creation not only is pivotal for their personal growth and mastery, but also presents opportunities to innovate, experiment with new technologies, and address unique challenges.

Survey respondents said that they imagine AI tools such as GitHub Copilot could enhance the code-writing experience by reducing the time spent on boilerplate code, making it easier to interact with unfamiliar APIs, and even changing the entire developer experience of writing code.

Reducing boilerplate: "It's fantastic for boilerplate, or roughly sketching out a framework."

Learning how to use new APIs: "I often have to look up usage of APIs that I am not familiar with. I do not want to have to sift through [documentation]."

Changing the developer experience: "Instead of writing code, I should be able to talk to the AI using voice, describe what I want."

When describing the ideal experience with AI-assisted coding tools, some developers talked about wanting to focus on the big picture versus having to deal with all the syntactical details of implementation.

"I envision this to be as intuitive and seamless as if I had a person next to me say, 'Hey, you missed something; you should consider doing this...' It would be great to be able to focus on the bigger picture of making code changes and letting my 'copilot' worry about the implementation details."

AI-driven development tools are poised to revolutionize the code-writing experience, making the development process not only more efficient but also more intuitive, allowing developers to channel their focus on overarching vision and innovation.

 

Key takeaways from how developers want to use AI

The survey revealed a hierarchy of developer expectations for AI integration, reflecting a desire for AI to tackle tasks that range from repetitive to complex.

1. Automating routine tasks. The majority of developers (96 percent) anticipated that AI would alleviate the tedium of routine tasks such as generating tests and documentation. These tasks, while essential, are often seen as monotonous and a distraction from the more creative aspects of development. AI's potential to enhance these areas could significantly boost DevEx and product quality.

2. Streamlining administrative duties. A significant portion (37 percent) hoped AI could simplify administrative overhead such as email parsing and task management. These duties, while not core to development, consume substantial time and are ripe for AI's organizational capabilities.

3. Enhancing well-being and efficiency. Fewer developers (25 percent) prioritized AI for well-being activities such as stress reduction, indicating a preference for AI to focus on enhancing job efficiency over personal management tasks.

Note that developers at every experience level (bucketed by years in role) selected options at roughly the same rates. That is, a seasoned professional was as likely to select "Generating unit tests" as a new hire. The exceptions were "evaluating performance to help identify areas of growth," which excited almost 20 percent of new hires versus only 4–5 percent of other categories, and "clarifying requirements," which excited roughly 9–11 percent of those new in their careers versus a mere 1–6 percent of more senior categories.

"AI should feel like pair programming—sort of like Ironman and Jarvis. Instead of writing code, I should be able to talk to the AI using voice, describe what I want, and the AI should be able to write the code, analyze it, optimize it, help me test and debug it, and then help me create and manage the PRs to get the code into the repo. My job should be on creativity and NOT limited by how fast I can type. Make us more productive so that we can deliver better features faster, giving us an edge on our competition."

 

What worries developers the most about integrating AI into their workflows?

The rapid growth and adoption of AI in software development has garnered significant attention, and its promise to automate away many mundane tasks is widely acknowledged as transformative. That promise, however, goes hand in hand with the skepticism many developers harbor toward it.

"I'm curious to see how it all plays out. I'm sensing some personal burnout of the 'AI to help you x' of everything that seems to have swept the company. The hype is very high. In what is typical of a giant company, we will build some good uses and some bad uses for AI; hopefully the bad use cases don't turn off users."

To gain an understanding of the primary concerns developers hold regarding the integration of AI into their daily routines, they were presented with a selection of possible apprehensions and asked to identify the one they found most significant. The following section explores a selection of these concerns, handpicked for a more detailed discussion. As in the previous section, each concern is supported by cherry-picked open-text responses. Figure 2 shows the list of concerns we provided for question 2, along with the corresponding percentage of respondents who endorsed each one.

[Figure 2]

 

Being more gimmicky than helpful

Selected by 29 percent of respondents

The skepticism around AI in software development often stems from the perception that AI's current abilities are overstated or fail to deliver on their promises. While AI is impressive in demos, developers are concerned that AI tools might not be able to handle the complexity and variety of real-world programming.

"To me it seems more like a gimmick. In order for me to let something external manage my calendar, prioritizing tasks, or refactoring code that is used in production services impacting millions of our customers, it would need to prove itself over a long period of time, and that at the very least seems too far away in the future."

This skepticism may explain why some organizations have struggled to convince their developers to adopt the AI tools that are being made available to them. Addressing developers' skepticism requires a multifaceted approach. AI tool developers need to ensure that their offerings are not just technologically advanced but also transparent and understandable. This can be achieved by accompanying AI tools with documentation and explanatory frameworks or integration of explainable AI10 in the tools. By demonstrating consistent and reliable performance in diverse real-world scenarios over time, these tools (and toolmakers) can gradually earn the trust of developers. This openness not only builds credibility but also empowers engineers to engage critically with AI recommendations, much as they would with advice from human colleagues.

"I think it's important in the long run to teach engineers why Copilot is recommending its particular way to do things. I could see engineers getting complacent in not trying to understand how the problem is getting solved. Just like how copy/pasting from stack overflow without analysis can be a bad thing, this could come up with the same result."

 

Introducing defects or vulnerabilities into your work

Selected by 21 percent of respondents

While many fear that AI may be more gimmicky than practical, developers harbor a deeper concern: the risk that AI could actively deteriorate the quality of work by introducing defects or vulnerabilities. This fear extends beyond the frustration of unmet expectations—it encompasses the potential for AI to undermine the integrity and safety of software systems.

"I am worried that AI will create answers that give the appearance of correctness, but are not actually correct."

This sentiment intensifies the skepticism previously highlighted, magnifying the caution developers exercise toward embedding AI into their workflows. It underscores the need for transparency from AI, comprehensive training for emergent tools, and the crucial role of human oversight. Ensuring that AI aids rather than hinders development hinges on this symbiotic relationship between human expertise and artificial intelligence.

 

Automating away your job

Selected by 10 percent of respondents

Some concerns about AI center on job security, such as job displacement or role augmentation. Somewhat surprisingly, only 10 percent of developers in this survey identified job displacement as a concern, and only 5 percent noted "Less compensation due to automation." Still, a portion of the developer population expresses unease about the potential for intelligent automation to encroach on territory traditionally reserved for human expertise.

"I feel like Blockbuster and Netflix. [AI] is about to replace me."

The rapid advancement of AI capabilities is a sign of significant progress, but to some it provokes the idea that their roles could be significantly altered.

"If AI is capable of writing, debugging, and documenting code based on an input prompt of requirements, then will human software engineers be relegated to only on-call and service-engineer tasks? This is the least enjoyable and problem-solving aspect of the job for myself and many others."

This survey shines a light on a pivotal discussion point for leaders and organizations: the apprehension of job displacement due to AI. It's incumbent on leadership to address these concerns proactively, not just through dialogue but by fostering a culture that embraces change. Initiatives such as comprehensive training programs that upskill employees to work alongside AI will be crucial. Leaders also must communicate the value of human insight and creativity—qualities that AI cannot replicate—by ensuring team members understand their evolving, irreplaceable role in a technologically augmented landscape.

 

Increasing bias in the workplace

Selected by 5 percent of respondents

Developers also say that AI has the potential to exacerbate bias in the workplace because of its reliance on historical data that may reflect existing biases.

"I am worried about the bias that exists within AI structures that can adversely affect minorities in the software engineering industry. If AI is not designed with everyone in mind, it will end up having bias against people that were not considered during its development, and it is usually unlikely that the software will be fixed afterwards as it is typically deemed too much effort for a smaller population group."

Bias throughout history is irrefutable, so it's important to ensure AI is not generating output that perpetuates it. Avoiding this outcome requires guardrails: including diverse data, improving transparency, and laying out ethics guidelines and fairness metrics. Microsoft's resources on responsible AI7 serve as an example of one organization's approach.

 

Other developer concerns regarding AI

This article has discussed a few of the top concerns that developers have regarding AI adoption, although this does not diminish the concerns chosen by only a few respondents: for example, concerns regarding AI's impact on the environment. It could be that only 1 percent of developers selected this not because of a lack of concern but because of a general lack of awareness, or because the survey design restricted them to a single selection. Leaders and organizations should not only explicitly address the major concerns of developers but also create awareness of these less popular concerns and proactively take steps to alleviate them.

"It is a very powerful tool, there is no doubt about that. The potential is massive. It is important to take a proactive approach toward the risks involved, rather than the reactive approach that is common with such rapidly expanding technologies. At least, a mix of the two."

 

The Future of AI in the Tech Industry

There is a ton of excitement around AI, just as there is a ton of cynicism. This survey highlights both, giving organization leaders the opportunity to address them. How can organizations and leaders help meet the asks and alleviate the concerns surrounding AI? This section lists some of the ways.

 

Education and awareness

There are three aspects of educating employees about AI: (1) How do AI models work? (2) How can AI tools/APIs be employed in existing services/technologies? (3) How do you equip leaders with the skills to guide their teams through the technological transformation?

Mitigating skepticism isn't just about demystifying AI; it's also about preparing leaders to navigate the cultural shifts it brings. Organizing AI bootcamps, promoting online courses, and offering leadership workshops that focus on change management in the context of AI can help build a well-informed and adaptable workforce.

 

Transparency

This again has multiple dimensions: (1) being transparent about the integration of AI in services (where it is integrated and which parts); and (2) leveraging explainable AI,10 that is, explaining AI results to developers rather than asking them to accept outputs at face value. This allows a developer to reason with the AI and make a sound decision to accept or reject the results. It can also provide feedback to mitigate or suppress biases learned from historical data.

 

Ethical considerations

There are so many facets to this concern that, at first, it can be overwhelming. These can vary from the impact of AI on the environment to the dataset on which it trains. To drive the solution, the first step is to understand and acknowledge the problem. These concerns can be addressed by issuing an ethics guideline for each scenario and establishing an ethics committee that concerns itself with ensuring that the guidelines are followed.

 

Human-in-the-loop

Many respondents reported AI being more of a gimmick than a useful aid as their top concern. Rather than delegating a task entirely to AI or entirely to a human, organizations can reach a middle ground by employing AI with a human in the loop, especially for critical tasks.

 

Conclusion

The journey of integrating AI into the daily lives of software engineers is not without its challenges. Yet, it promises a transformative shift in how developers can translate their creative visions into tangible solutions. As we have seen, AI tools such as GitHub Copilot are already reshaping the code-writing experience, enabling developers to be more productive and to spend more time on creative and complex tasks. The skepticism around AI, from concerns about job security to its real-world efficacy, underscores the need for a balanced approach that prioritizes transparency, education, and ethical considerations. With these efforts, AI has the potential not only to alleviate the burdens of mundane tasks, but also to unlock new horizons of innovation and growth.

 

Acknowledgments

We would like to thank all the study participants and research reviewers for their valuable feedback and insights.

 

References

1. Barke, S., James, M. B., Polikarpova, N. 2023. Grounded Copilot: how programmers interact with code-generating models. Proceedings of the ACM on Programming Languages, 7(OOPSLA1), Article 78, 85–111; https://dl.acm.org/doi/abs/10.1145/3586030.

2. Beller, M., Orgovan, S., Buja, S., Zimmermann, T. 2020. Mind the gap: on the relationship between automatically measured and self-reported productivity. IEEE Software 38(5); https://ieeexplore.ieee.org/document/9311217.

3. Finnie-Ansley, J., Denny, P., Becker, B. A., Luxton-Reilly, A., Prather, J. 2022. The robots are coming: exploring the implications of OpenAI Codex on introductory programming. In Proceedings of the 24th Australasian Computing Education Conference, 10–19; https://dl.acm.org/doi/abs/10.1145/3511861.3511863.

4. Forsgren, N., Storey, M. A., Maddila, C., Zimmermann, T., Houck, B., Butler, J. 2021. The SPACE of developer productivity: there's more to it than you think. acmqueue 19(1), 20–48; https://queue.acm.org/detail.cfm?id=3454124.

5. Gradišnik, M., Beranič, T., Karakatič, S. 2020. Impact of historical software metric changes in predicting future maintainability trends in open-source software development. Applied Sciences 10(13), 4624; https://www.mdpi.com/2076-3417/10/13/4624.

6. Meyer, A., Barr, E., Bird, C., Zimmermann, T. 2019. Today was a good day: the daily life of software developers. IEEE Transactions on Software Engineering 47(5); https://ieeexplore.ieee.org/document/8666786.

7. Microsoft AI. Empowering responsible AI practices. 2024; https://www.microsoft.com/en-us/ai/responsible-ai.

8. Mozannar, H., Bansal, G., Fourney, A., Horvitz, E. 2022. Reading between the lines: modeling user behavior and costs in AI-assisted programming. arXiv:2210.14306; https://arxiv.org/abs/2210.14306.

9. Peng, S., Kalliamvakou, E., Cihon, P., Demirer, M. 2023. The impact of AI on developer productivity: evidence from GitHub Copilot. arXiv:2302.06590; https://arxiv.org/abs/2302.06590.

10. Ribeiro, M. T., Singh, S., Guestrin, C. 2016. Why should I trust you? Explaining the predictions of any classifier. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 1135–1144; https://dl.acm.org/doi/10.1145/2939672.2939778.

11. Vaithilingam, P., Zhang, T., Glassman, E. L. 2022. Expectation vs. experience: evaluating the usability of code generation tools powered by large language models. In Extended Abstracts of the Conference on Human Factors in Computing, 1–7; https://dl.acm.org/doi/10.1145/3491101.3519665.

12. Ziegler, A., Kalliamvakou, E., Li, X. A., Rice, A., Rifkin, D., Simister, S., Sittampalam, G., Aftandilian, E. 2022. Productivity assessment of neural code completion. In Proceedings of the 6th ACM SIGPLAN International Symposium on Machine Programming, 21–29; https://dl.acm.org/doi/10.1145/3520312.3534864.

 

Mansi Khemka is a software engineer at Microsoft. Her work helps set up security and compliance guardrails for Microsoft's various services and employees. She has a rich background in artificial intelligence, built from her master's degree at Columbia University and her past and ongoing research in the domain.

Brian Houck is a principal applied scientist at Microsoft focused on improving the well-being and productivity of Microsoft's internal developers. His work explores not only the technical factors that impact developer productivity, but also cultural, environmental, and organizational factors that impact the daily experiences of engineers. Over the past three years, much of his research has centered on how the shift to remote/hybrid work has impacted developers.

 

Copyright © 2024 held by owner/author. Publication rights licensed to ACM.

acmqueue

Originally published in Queue vol. 22, no. 3