
Low-code Development Productivity

"Is winter coming" for code-based technologies?

João Varajão, António Trigo, Miguel Almeida

Over the past few years, the software development world has been witnessing the advent of low-code technologies. These technologies promise significant improvements, both in terms of the development process and productivity, but evidence of these improvements is practically nonexistent in the research literature. This may lead you to question whether the promised gains are just propaganda from software vendors.

Productivity in the context of software-development technologies can be defined as the efficiency made possible in the production of software goods or services. Productivity is typically expressed by a measure that relates output, input, and time. The Cambridge Dictionary defines productivity as "the rate at which a company or country makes goods, usually judged in connection with the number of people and the amount of materials necessary to produce the goods"; similarly, the Oxford Dictionary defines it as "the rate at which a worker, a company, or a country produces goods, and the amount produced, compared with how much time, work, and money is needed to produce them."

As you might expect, productivity is a critical issue in software development. The emergence of low-code, extreme low-code (quasi no-code), and no-code technologies is precisely grounded in the search for greater efficiency and effectiveness in software development.

The main arguments put forth on the websites of software houses that produce and market low-code technologies largely focus on productivity improvements.

 

Overall, this seems too good to be true, and it is important to separate what is "advertising" from what is "achievable" for companies that are weighing whether to adopt this technology for their software development.

This article aims to provide new insights on the subject by presenting the results of laboratory experiments carried out with code-based, low-code, and extreme low-code technologies to study differences in productivity. Low-code technologies have clearly shown higher levels of productivity, providing strong arguments for low-code to dominate the software-development mainstream in the short/medium term. The article reports the procedure and protocols, results, limitations, and opportunities for future research (expanding the results of Trigo et al.21).

 

Background

Source code is the set of logical instructions that a programmer writes when developing an application. Once written, these instructions are compiled/interpreted and converted into machine code. High-level programming languages such as Python, Java, JavaScript, PHP, C/C++, C#, etc., are examples of technologies used in code-based application development.

Low-code software development, on the other hand, consists of minimizing the amount of manual coding by using support tools. The objective is to develop software faster and with less effort on the part of development teams, thus accelerating software delivery.

Examples of low-code/no-code software-development technologies are IBM Automation Platform, Zoho Creator, Appian, Mendix, OutSystems, AgilePoint, Google AppSheet, Nintex, TrackVia, Quickbase, ServiceNow, Salesforce App Cloud, Microsoft Power Apps, Oracle Visual Builder, Oracle APEX, and Quidgest Genio, to name just a few. The distinctive feature of these technologies is that they allow the creation of software applications with minimal hand-coding.22, 23

Typically, low-code platforms provide a graphical environment that facilitates application development, unlike code-based technology, which requires manual coding (i.e., almost everything is developed graphically in low-code technologies, with little or no programming, allowing people with no programming competencies to create software applications). One disadvantage of these technologies is the licensing costs, which are known to be higher than for code-based technologies.17

 

Method

In this research, laboratory experiments were performed in a controlled environment, following a previously defined procedure and protocols to enable accurate measurements.2 The experiments were designed to be objective, so as to avoid bias in the results (e.g., bias resulting from the researchers’ influence/perspective).9

The underlying research question was: Do low-code technologies result in higher software-development productivity than code-based technologies (as reported in the gray literature)? The variable under study was productivity in the creation and maintenance of software applications.

For each experiment, a software-development technology was selected (code-based, low-code, or extreme low-code, i.e., quasi no-code), and one developer with proven proficiency in that technology was invited to participate. In the case of code-based technology, the developer’s preferred technology was selected. The productivity calculation was based on the UCPA (use case points analysis) method.12

The artificial and controlled environments of the experiments made it possible to measure execution times accurately; this is impossible in other types of studies, such as field experiments, in which it is not viable to control all the external stimuli that affect the performance of tasks.24

The experiments were structured into five stages:

0 Experiment design

I Briefing

II Software application development (creation)

III Software application development (maintenance)

IV Results analysis

Stages I, II, and III were repeated for each technology involved in the experiments.

 

Stage 0 – Experiment design

Stage 0, the preparatory phase for the various experiments to be performed, was carried out only once. During this stage, the procedure to be followed was defined; the protocols that specify the application to be developed and maintained (structured in two stages) were created; and the methods to be used to estimate and measure productivity were specified. The protocols are available for download at https://doi.org/10.5281/zenodo.6407074.

The UCPA method was chosen from the several possible alternatives (e.g., lines of code,14 COCOMO II (Constructive Cost Model),19 function point analysis,8 etc.) because of its focus on the functionalities of the applications to be developed and its independence from the technology used (which, in the case of these experiments, is fundamental).

The method consists of the following phases:2,3,11

1. Calculation of the UUCP (unadjusted use case points) variable, using the variables UAW (unadjusted actor weight) and UUCW (unadjusted use case weight), respectively related to the perceived complexity of actors and use cases: UUCP = UAW + UUCW

2. Adjustment of UUCP, considering a set of factors of a technical and environmental nature reflected in the variables TCF (technical complexity factor) and EF (environmental factor). The combination of the UUCP variable with the TCF and EF variables results in the assessable UCP (use case points) of the project: UCP = UUCP × TCF × EF

3. Finally, the UCP variable is multiplied by the PF (productivity factor), which represents the number of hours necessary for the development of each UCP: Total Effort = UCP × PF

Thus, with the UCPA model as a reference, the PF variable was calculated: The lower the resulting PF, the higher the productivity of the technology under study.
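As a purely illustrative sketch of this arithmetic (not part of the study’s tooling), the following Python fragment computes UCP and total effort from the variables just described; all numeric inputs are hypothetical placeholders.

```python
# Illustrative sketch of the UCPA arithmetic described above.
# All numeric inputs are hypothetical placeholders, not values from the experiments.

def unadjusted_use_case_points(uaw: float, uucw: float) -> float:
    """UUCP = UAW + UUCW (unadjusted actor weight plus unadjusted use case weight)."""
    return uaw + uucw

def use_case_points(uucp: float, tcf: float, ef: float) -> float:
    """UCP = UUCP x TCF x EF (adjusted by technical and environmental factors)."""
    return uucp * tcf * ef

def total_effort_hours(ucp: float, pf: float) -> float:
    """Total effort = UCP x PF, where PF is the number of hours per use case point."""
    return ucp * pf

uucp = unadjusted_use_case_points(uaw=10, uucw=120)   # placeholder weights
ucp = use_case_points(uucp, tcf=1.0, ef=1.0)          # both factors set to 1, as in the experiments
print(total_effort_hours(ucp, pf=0.5))                # e.g., 0.5 h/UCP -> 65.0 hours
```

In the experiments, this relation is used in reverse: development time is measured, and the PF of each technology is derived from it.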

The experiment was structured in two main parts: in the first part (stage II), a software application was created; in the second part (stage III), that application was maintained (corrective and evolutionary maintenance).

Appendices A.1, A.2, and A.3 identify the actors and use cases described in the experiment protocols, as well as their respective scores (weight).

For the first part of the experiment (creation of a software application), TCF was given a value of 1, considering the low application complexity. Given that the purpose of the experiment was to determine the EF value for each technology, EF was also set at 1 as the starting point for calculating the UCP variable. Thus, for the first part (stage II) of the experiment:

  UUCP = UAW + UUCW = 125 + 9 = 134

  UCP = UUCP × 1 × 1 = 134 × 1 × 1 = 134

For the second part of the experiment (maintenance), participants were asked to make two changes (corresponding to a weight of 20 points) and to implement new use cases (also corresponding to 20 points), as shown in appendix A.3. Thus, in total, for the second part (stage III) of the experiment:

  UUCP = UAW + UUCW = 40 + 9 = 49

  UCP = UUCP × 1 × 1 = 49 × 1 × 1 = 49

Throughout each experiment, a researcher was always present. Whenever requested by the developer, additional clarifications were provided on the application to be developed. It should also be noted that the experiments were fully recorded on video for subsequent analysis. Break times (e.g., for meals) were registered but not considered for productivity calculation. During the experiments, the developers could access all the information they needed; the only restriction was that they could not contact other developers for help.

 

Stage I – Briefing

Stage I was preparatory and consisted of presenting the protocol and the conditions for conducting the experiment to the developer. The use cases were presented in detail, as well as the mockups and data-model requirements. The degrees of freedom were also defined—for example, regarding the color scheme of the graphical interface.

The importance of the final application being as close as possible to the mockups was duly stressed, as was the need for strict compliance with the specifications—developers were told to resist the temptation to do things “a better way”—since the quality assessment planned for the final stage of the experiment would consider these very aspects. Time measurement started after the completion of this phase.

 

Stage II – Software application development (creation)

The objective of stage II was to create a new application, following the protocol defined in the first part of the experiment. Each developer’s activities were recorded on video, and one of the team’s researchers was always present during this stage. Besides the programming corresponding to the defined use cases, the activities performed by the developer included the configuration of the development environments used, the creation of databases, and testing. It should be noted that the complementary activities varied significantly depending on the development technology used.

 

Stage III – Software application development (maintenance)

Stage III followed the same procedure as stage II, except that the objective was not the creation of a new application but the maintenance (corrective and evolutionary) of an existing one (the application created in stage II). Moreover, the activities were based on a new protocol and requirements (see appendix A.3), which were made available only after completing stage II (i.e., in stage II, the developers were not aware of the protocol for stage III).

 

Stage IV – Results analysis

After completing the experiments, the time records (registered manually) and the videos of the activities performed were checked to ensure the accuracy of the time counting. Furthermore, to promote greater accuracy in the calculation of the productivity made possible by each technology, a quality assessment of each resulting application was performed with the participation of at least two researchers, considering four fundamental criteria: compliance with the mockups; fulfillment of the functionalities as described in the use cases; occurrence of errors; and application performance. Note that although the quality assessments of the various applications resulted in minor differences in the final calculated productivity, this had no significant effect on the productivity differences among the technologies in the experiments or on the overall conclusions of the study.

 

Results

Three experiments were conducted using the most recent versions of the selected technologies: code-based (Django/Python4); low-code (OutSystems13); and extreme low-code (Quidgest Genio15). All the participating developers (one per experiment) were experienced in using the target technology in a professional context. The selection of the low-code technologies was determined by the researchers’ contacts for recruiting participants. In the case of the code-based technology, the participant chose Django/Python (he also had experience with several other technologies, including PHP, C#, etc.). All the participants were familiar with the experiment domain (aware of the involved concepts) and the type of application to be developed.

For each experiment, the results (presented in table 1) are based on the variables described below.

[Table 1: experiment results]

Additionally, for each variable, the results of stage II (software application creation) and stage III (software application maintenance) are presented, as well as for the experiment as a whole (total).

The QF variable is related to the quality of the final product and was determined considering four fundamental criteria: (1) compliance with the mockups; (2) fulfillment of the requirements as described in the use cases; (3) occurrence of errors; and (4) application performance.

For example, if an application had a minor deviation in the implementation of a particular use case compared with the respective mockup, without any impact on functionality, the QF variable corresponding to that use case would be penalized by 5 percent. In the case of an error inhibiting the use of the functionality, however, the penalty could go up to 100 percent.

The QF variable’s final value (per technology) results from the weighted average of the application’s overall quality (considering the weights of the use cases). For example, a QF of 0.9 can be interpreted as the application meeting 90 percent of the specification described in the corresponding protocol. To reduce bias in the quality assessment, two researchers reviewed the applications and applied a test script created for that purpose. In the end, the application performance criterion was not considered, because no differences were identified among the resulting applications; had such differences existed, they could have been due to web-server capacity rather than the technology involved.
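To make the weighting concrete, here is a minimal, hypothetical sketch of a weighted-average QF computed from per-use-case penalties; the use case names, weights, and penalties are invented for illustration only and are not values from the study.

```python
# Hypothetical sketch: weighted-average quality factor (QF) across use cases.
# Names, weights, and penalties are illustrative, not values from the study.

use_cases = [
    # (use case, weight in points, penalty: 0.05 = minor mockup deviation,
    #  1.00 = error inhibiting the functionality)
    ("Manage customers", 10, 0.05),
    ("Register orders",  15, 0.00),
    ("List invoices",     5, 1.00),
]

total_weight = sum(weight for _, weight, _ in use_cases)
qf = sum(weight * (1.0 - penalty) for _, weight, penalty in use_cases) / total_weight

print(f"QF = {qf:.2f}")  # 0.82 here: the application meets roughly 82 percent of the specification
```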

Thus, the implemented UC variable (considering QF) corresponds to the UC effectively implemented and is calculated by multiplying the UUCP variable of the experiment by the QF variable.

The time variable corresponds to the creation/maintenance time of the application, measured in hours.

The PF variable (without considering QF) consists of the calculated productivity factor, having as reference only the UUCP of the experiment (that is, Time/UUCP); this variable ignores the degree of compliance (QF) with the specification in the protocol.

Finally, the PF variable (considering QF) consists of the calculated productivity factor, having as reference the implemented UC variable (considering QF). Thus, this variable better reflects productivity, since it takes into account the UC effectively implemented (considering the QF) and not simply those specified in the experiment’s protocol.
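Putting these definitions together, the following short sketch shows the two productivity-factor calculations; the QF and time values are placeholders chosen only to illustrate the arithmetic.

```python
# Hypothetical sketch of the two productivity-factor calculations described above.
# QF and elapsed hours are placeholders, not measurements from the experiments.

uucp = 134           # use case points specified in the protocol (stage II total from the text)
qf = 0.90            # quality factor of the resulting application (placeholder)
hours = 40.0         # measured creation time for the technology (placeholder)

implemented_uc = uucp * qf            # use case points effectively implemented
pf_without_qf = hours / uucp          # hours per specified use case point
pf_with_qf = hours / implemented_uc   # hours per effectively implemented use case point

print(f"PF (without QF) = {pf_without_qf:.2f} h/UCP")   # 0.30
print(f"PF (with QF)    = {pf_with_qf:.2f} h/UCP")      # 0.33; a lower PF means higher productivity
```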

 

Discussion and Conclusion

Analyzing the QF variable first, it can be seen that, in the case of the code-based and low-code technologies, the application’s quality degraded from the first part of the experiment (stage II) to the second part (stage III). The same did not happen in the case of the extreme low-code technology. Given the nature of the changes in the protocols of the experiment, this should not be attributed to the technologies under study, but mainly to the limited testing carried out by the developers.

For example, in stage III, the PF of the application maintained with low-code technology was penalized in a use case implementation because of a coding error that caused the application to abort its normal operation. Nevertheless, globally, low-code and extreme low-code technologies allowed the development of more robust applications in this experiment. It is important to stress, however, that regardless of the technology, rigorous testing cannot be disregarded in the software-development process.

Considering the PF variable, only in the case of the code-based technology was there an improvement from stage II to stage III. This must be put into perspective when comparing it with the low-code technologies: in stage II of the experiment, the code-based technology required considerable time for setup activities (e.g., database creation), which did not have to be repeated in stage III. The low-code technologies proved more efficient in setup activities. Therefore, the total values (the Totals column in table 1) better reflect the reality of the experiment.

Tables 2 and 3 present a comparison of the productivity verified in the various experiments. Table 2 does not consider the QF variable, whereas table 3 presents the differences considering it. Although considering the QF variable gives more precision to the measurements, the comparison without quality was included to verify whether it influenced the global conclusions. The results show that the findings of the experiments remain the same whether or not QF is considered.

[Tables 2 and 3: productivity comparison]

 

Overall, in these experiments, low-code technologies have shown considerably higher productivity than code-based technology, ranging from about a threefold to a tenfold increase in productivity.

This expands prior work21 and is in accordance with some gray literature reports, which state that developing applications using low-code technologies accelerates the process,5 resulting in faster delivery and higher productivity.6,16 For example, research by Forrester shows that low-code platforms speed up development about five to ten times.18

According to Gartner,7 low-code will account for more than 70 percent of software-development activity by 2025. This article presents one of the first research-based studies focused on productivity differences among types of development technology.

It is not without limitations, however. First, the selected technologies do not represent "all" extant low-code and code-based technologies. They include some of the most popular technologies, but many more could be part of the experiments.

Second, the experiments’ protocols specify a "management software" application—and there are many other types, such as multimedia. It would be interesting to study the "fit" of the different technologies, considering the application type to be developed.

Third, the protocols for developing/maintaining the application software were designed to be implemented in a short period of time by a single developer. Since software development is often a collaborative process, this opens space for further research.

Finally, the participants in the experiments were all experienced developers familiar with the specific technologies they used. Their different profiles could be a source of bias.

Overall, these limitations may have a small influence on the recorded times but do not put the conclusions into question, since low-code technologies have clearly shown higher levels of productivity. Nevertheless, further studies are required to enrich the knowledge base about these technologies’ productivity (the full procedure and protocol details for replication studies are available at sites.google.com/view/sdtproductivity).

As stated by Varajão,22,23 "Low-code, extreme low-code, and no-code software development, supported by innovative technologies such as artificial intelligence are expected to accelerate rapidly toward worldwide adoption as major enablers of digital transformation." The productivity differences found in these experiments clearly provide strong arguments for low-code technologies to dominate the software-development mainstream in the short/medium term.

 

Acknowledgments

The authors would like to thank the study participant developers. Special thanks to James Maurer for all his support (and patience). We also note that this study did not receive any specific grants from the public, commercial, or not-for-profit domains.

 

Appendices

[Appendix A.1]

[Appendix A.2]

[Appendix A.3]

References

1. Appian; https://appian.com.

2. Balijepally, V., Mahapatra, R., Nerur, S., Price, K. H. 2009. Are two heads better than one for software development? The productivity paradox of pair programming. MIS Quarterly 33(1), 91–118; https://dl.acm.org/doi/10.5555/2017410.2017418.

3. Clemmons, R. K. 2006. Project estimation with use case points. Crosstalk, The Journal of Defense Software Engineering 19(2), 18–22; https://www.researchgate.net/publication/200036324_Project_Estimation_With_Use_Case_Points.

4. Django; https://djangoproject.com.

5. Forrester Consulting. 2019. Large enterprises succeeding with low-code.

6. Gartner. 2020. The 2020 Gartner Magic Quadrant for enterprise low-code application platforms.

7. Gartner. 2021. Forecast analysis: low-code development technologies; https://www.gartner.com/en/newsroom/press-releases/2022-12-13-gartner-forecasts-worldwide-low-code-development-technologies-market-to-grow-20-percent-in-2023.

8. Lokan, C. 2005. Function points. Advances in Computers 65, 297-347; https://www.sciencedirect.com/science/article/abs/pii/S0065245805650073.

9. McLeod, S. 2012 (updated 2023). Experimental method. Simply Psychology; https://www.simplypsychology.org/experimental-method.html.

10. Mendix; https://mendix.com.

11. Nageswaran, S. 2001. Test effort estimation using use case points. Quality Week; https://www.researchgate.net/publication/228954898_Test_effort_estimation_using_use_case_points.

12. Ochodek, M., Nawrocki, J., Kwarciak, K. 2011. Simplifying effort estimation based on use case points. Information and Software Technology 53(3), 200-213; https://www.sciencedirect.com/science/article/abs/pii/S095058491000176X.

13. OutSystems; https://outsystems.com.

14. Pressman, R., Maxim, B. 2020. Software Engineering: A Practitioner’s Approach, 9th edition, McGraw-Hill Education; https://www.mheducation.com/highered/product/software-engineering-practitioner-s-approach-pressman-maxim/M9781259872976.html.

15. Quidgest Genio; https://genio.quidgest.com.

16. Richardson, C., Rymer, J. 2016. Vendor landscape: the fractured, fertile terrain of low-code application platforms. Forrester Consulting; https://www.forrester.com/report/Vendor-Landscape-The-Fractured-Fertile-Terrain-Of-LowCode-Application-Platforms/RES122549.

17. Sahay, A., Indamutsa, A., Di Ruscio, D., Pierantonio, A. 2020. Supporting the understanding and comparison of low-code development platforms. 46th Euromicro Conference on Software Engineering and Advanced Applications (SEAA), 171–178; https://ieeexplore.ieee.org/abstract/document/9226356.

18. Sanchis, R., García-Perales, Ó., Fraile, F., Poler, R. 2020. Low-code as enabler of digital transformation in manufacturing industry. Applied Sciences 10(1), 1–17; https://www.mdpi.com/2076-3417/10/1/12.

19. Sommerville, I., 2018. Software Engineering, 10th edition. Pearson; https://www.pearson.com/en-us/search.html?aq=software%20engineering.

20. TrackVia; https://trackvia.com.

21. Trigo, A., Varajão, J., Almeida, M. 2022. Low-code versus code-based software development: Which wins the productivity game? IEEE IT Professional 24(5), 61–68; https://www.computer.org/csdl/magazine/it/2022/05/09967415/1IIYACGLXnq.

22. Varajão, J. 2021. Software development in disruptive times. acmqueue 19(1), 94–103; https://queue.acm.org/detail.cfm?id=3458743.

23. Varajão, J. 2021. Software development in disruptive times. Communications of the ACM 64(10), 32–35; https://cacm.acm.org/magazines/2021/10/255713-software-development-in-disruptive-times.

24. Wenz, A. 2021. Do distractions during web survey completion affect data quality? Findings from a laboratory experiment. Social Science Computer Review 39(1), 148-161; https://dl.acm.org/doi/abs/10.1177/0894439319851503.

 

João Varajão is a professor and researcher of information systems and project management at the University of Minho/ALGORITMI Research Centre/LASI. He has published numerous refereed publications and written and edited books, book chapters, and conference communications. He serves as editor-in-chief, associate editor, and scientific committee member for conferences and international journals. ORCID: 0000-0002-4303-3908.

António Trigo is a professor of management information systems at the Polytechnic Institute of Coimbra in Portugal and a researcher at the ALGORITMI Research Centre/LASI/University of Minho. He has published refereed publications and written books and conference communications, among other projects. Before joining academia, he worked as a software engineer and project manager. ORCID: 0000-0003-0506-4284.

Miguel Almeida received a master’s degree in management information systems in 2020 from the Coimbra Institute of Accounting and Administration at the Polytechnic Institute of Coimbra, Portugal. He works as a programmer at Deloitte Portugal. Previously he worked as a software developer at Softinsa.

Copyright © 2023 held by owner/author. Publication rights licensed to ACM.

acmqueue

Originally published in Queue vol. 21, no. 5