Almost every organization must handle massive volumes of data carefully before that data can yield value for decision-making. This paper explores how the big data found in organizations should be used so that it becomes an essential input to decision-making (Sill 2016). It explains how data should be processed within the shortest time possible to support real-time, intelligent decisions, and it describes a framework, derived by the author, that the top managers of America’s Community Bankers (ACB), in collaboration with their junior employees, should implement when handling data so that the resulting decisions give the firm an added competitive advantage.
Data Gap Analysis
Background
America’s Community Bankers is a trade association that represents the grievances of community banks and savings institutions to the relevant authorities. It was formed in 1992 when the National Council of Community Bankers merged with the U.S. League of Savings Institutions (America’s Community Bankers 2007). Initially known as Savings and Community Bankers of America, it changed its name to America’s Community Bankers in 1995 (America’s Community Bankers 2007). It operates mainly through community banks owned by American citizens. The primary issue the organization now has to address is understanding the financial needs of the American residents and businesspeople in the areas where its member banks operate.
Data sources and datasets available for America’s Community Bankers
Research Information Systems (RIS), which host many digital libraries, are a significant data source for America’s Community Bankers. They include software that enables data exchange through platforms such as the ACM portal and SpringerLink. Another data source, known as SOD, simplifies the tedious task of selecting data and employs a seismology-based method of data selection (García et al. 2017). CALC is a dataset available to America’s Community Bankers that allows data to be combined.
Table 1. Key financial and non-financial types of data in the case study
Data integrity and current or potential gaps in data analytics and data protection
Data integrity is mandatory for America’s Community Bankers if it is to adhere to its ethical obligations. For any organization to maintain its integrity and reputation, it must have the right business model for data protection and privacy (Pullonen et al. 2019). The organization is required to respect American bankers’ privacy and never to use their information without their consent.
Table 2. The identified gaps in data management (columns: Organization, Data Sources, Business Functions)
Recommendations for America’s Community Bankers’ data analytics processes
Reorganization of current data-driven strategies to streamline and enhance data analytics and decision-making
Data analysis and decision-making can be enhanced by first defining the questions and ensuring that they are the right ones. The organization should ensure that the questions are concise, clear, and measurable (Cleff 2021), and designed to offer a high chance of finding solutions within the shortest time possible. The organization should also establish suitable methods of manipulating, analyzing, and interpreting the data (Janssen et al. 2017). After analyzing the data, the results should be interpreted so that they answer the original questions and withstand any objections that may arise.
Table 3. Proposed data analytics
Roadmap to the development or enhancement of an extensive data infrastructure
Firstly, America’s Community Bankers should evaluate the nature and form in which its data exists and identify which elements it needs for its systems to work effectively. The organization should then diagnose the actions and decisions it plans to take (Chen et al. 2021), evaluating whether those planned decisions and steps are effective and what barriers might exist. The last step is to strategize by identifying the plans, actions, and new approaches it will adopt.
Table 4. Data analytics implementation process
Compliance aspects of the proposed changes in data analytics
America’s Community Bankers should adopt data protection policies and maintain ethical standards when handling data so that the organization complies with the law. It should ensure that it adheres to the regulations set out in the General Data Protection Regulation (GDPR).
Table 5. Data protection and ethical compliance
How the processed data can be used in organizational decision-making
Table 6. Supported business decisions
To ensure that data management is achieved effectively, an organization can apply an operational strategy. This can be accomplished by devising ways to improve existing employees’ skills and by training new employees. In this way, the company ensures that its established data management methods are maintained and updated in line with technological advancements. Through such training, new employees’ big data management skills are brought up to the standard the company requires.
Executive Summary
Big data has proliferated in the contemporary business environment. In the past, this information did not help organizations much; with technological advancement, however, and with the growth of online databases and information-processing software, big data now drives crucial decisions and innovation. Notably, a firm’s ability to use big data effectively to understand current trends and adjust operations depends on that company’s willingness to examine its existing data and use corporate performance management (CPM) software to organize, filter, analyze, and interpret it, among other things (Pulakos, 2004; Corporate performance management, n.d.). CPM is essential in this regard because it is inexpensive, accurate, and customizable. In this paper, the author discusses the Independent Community Bankers of America’s use of data about commercial bank loans to American businesses and citizens to streamline its operations through effective decision-making. The paper also provides a critical discussion of the confidence interval, Chi-square, and ANOVA as specific examples of inferential statistics that corporate organizations can use for decision-making (Tabachnick and Fidell, 2007; Sharpe, 2015). The decision the author chose pertains to the Independent Community Bankers of America’s business expansion. Since the outcomes show a positive trend in loan taking, the recommendation is that the company focus on loan-related services to individual and institutional customers.
Data Preparation Process
For an organization to succeed in today’s fast-paced environment, it must make strategic decisions in a timely and efficient way. Leaders need reliable and actionable data to make these strategic decisions and to inspire organizational growth and development (Nakagawa and Cuthill, 2007). Financial institutions hold significant customer data, but most of them do not use this information properly or to their advantage (Nick, 2007). Notably, companies’ struggle to convert data into critical business leads lies in their inability to organize and prepare the existing data. Fortunately, a variety of software tools on the market can help companies organize their data, and most of them contain procedures for collecting, filtering, and preparing datasets, as the sketch below illustrates.
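The case study names no specific tool, but a minimal pandas sketch can show what the collect-filter-prepare steps look like in practice. The file name and the "date" and "loans_bn" columns here are illustrative assumptions, not taken from the actual dataset:

```python
# Minimal collect-filter-prepare sketch; the file and column names are
# hypothetical stand-ins for whatever the CPM tool actually ingests.
import pandas as pd

def prepare_loan_data(path: str) -> pd.DataFrame:
    df = pd.read_csv(path, parse_dates=["date"])   # collect
    df = df.dropna(subset=["loans_bn"])            # drop incomplete records
    df = df[df["loans_bn"] > 0]                    # filter out invalid entries
    df = df.sort_values("date").reset_index(drop=True)
    df["year"] = df["date"].dt.year                # derive grouping columns
    df["month"] = df["date"].dt.month
    return df
```

A dedicated CPM suite automates these steps behind the scenes; the sketch merely makes the pipeline explicit.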
Notably, corporate performance management (CPM) is the umbrella term describing all pieces of software that companies can use to automate their business operations. Specifically, CPM automates such core business functions as budgeting and forecasting, thereby enabling organizations to execute strategic goals promptly (Integrated business intelligence and performance management: What you need to know, n.d.; Corporate performance management: A tool for formulating organizational strategies, n.d.). It allows companies to define and analyze strategic options before implementation to ensure that the recommended course of action agrees with company finances, business model, operational approaches, and key performance indicators. Any existing CPM software will help the user collect, filter, and prepare data for analysis (Corporate performance management, n.d.). CPM saves the company time and money by preventing employees from entering information manually and tracking it using less specialized software.
The continued utilization of CPM can reduce budgeting, analytics, forecasting, and planning time by 50 to 70 percent. It also reduces the time needed for data preparation, report production, and assessment of performance. Understanding descriptive statistics adds value to business processes and makes big data more meaningful to the company (Fisher and Marshall, 2009; Du Prel et al., 2009). Once the CPM software is up, no major changes will be needed to make it function. Instead, users will input daily figures during their usual working times, and the software will organize the data and generate graphs or reports whenever needed. Companies such as the Independent Community Bankers of America have information about banks and their clients. Using CPM, the company easily manipulates available data to see trends over the years and predict possible future performance.
Rapid consolidation of data is also possible within any organization that has integrated CPM into its operations. During the life of a company, specific changes that cause significant disruption of data may occur. However, with CPM, such transitions will be seamless, and the company will need only focus on delivering its promise to the customers and maintaining its bottom line. The powerful analytics of the enhanced CPM software promote and enhance data-driven decision-making, which was once a difficult and costly undertaking (Data modeling: Conceptual, logical, physical, data model types, n.d.; Data-driven decision-making: Succeed in the digital era, n.d). Companies in the past maintained disorganized data, and they hardly used it for business enhancement because no CPM that could simplify the job existed. Rather than spend a significant amount of time and resources cleaning historical data and analyzing them using less-sophisticated tools to see possible trends, these past companies preferred instead to use trial and error and to base their forward decisions on the CEO’s or top manager’s recommendations.
Big data has helped the Independent Community Bankers of America understand many important financial trends in the United States. For example, the Independent Community Bankers of America knows the amounts of business loans extended by American commercial banks to the public from as far back as the Great Depression of the 1930s. Its CPM has organized this data carefully enough that one can see specific trends in how Americans have borrowed over the years. The advantage of this arrangement is that one can select any data points and examine them for a more effective case study. For example, a look at the monthly data on business loans in the United States between 2010 and 2020 shows a general upward trend in the loans extended to the public, as illustrated in Figure 1. With this graph, it is evident that more American people and companies are taking loans, and financial institutions could leverage the same data to improve performance.
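A simple way to quantify such a trend is a least-squares fit over the monthly series. In this sketch the figures are placeholders standing in for the actual dataset:

```python
# Fit a linear trend to a monthly loan series; the values are illustrative
# placeholders, not the real 2010-2020 figures.
import numpy as np

loans_bn = np.array([2359.1, 2371.4, 2402.8, 2425.0, 2433.7,
                     2458.2, 2490.5, 2511.9, 2536.4, 2553.2])
t = np.arange(len(loans_bn))                       # month index 0..n-1
slope, intercept = np.polyfit(t, loans_bn, deg=1)  # slope first for deg=1
print(f"Average monthly change: {slope:.2f} billion dollars")
```

A positive slope corresponds to the upward trend visible in Figure 1.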
The data is representative of the American population because it is drawn from all American commercial banks; it therefore covers all loans extended by those banks to individuals and businesses throughout the United States. The main disadvantage of this data is that it does not show the specific individuals or groups that took the most loans. As such, it is difficult for a user to target specific borrowers with good credit histories; for that decision, one would need access to another dataset about individual creditworthiness. However, the data presented here is highly generalizable because it is derived from a large sample. Indeed, the sample from which the data is drawn is almost the same size as the population.
Data Modeling
Descriptive Statistics
Data is useful only when knowledgeable and skilled individuals analyze it and interpret the results meaningfully. One way the Independent Community Bankers of America can turn its raw data about the amounts of loans borrowed by American citizens and businesses into insight is to compute a confidence interval. Doing so will help the company gain some insight into probable loaning trends in the coming years (Marshall and Jonker, 2010; Larson, 2006). The confidence interval gives a clear picture of how much confidence can be placed in a given estimate. It is obtained from the mean and the standard deviation, which together characterize the dataset. Mathematically, the confidence interval CI is given by:
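The formula itself does not survive in the text; assuming the standard large-sample form that the surrounding description (a mean plus a standard deviation) implies, it reads:

$$\mathrm{CI} = \bar{x} \pm z_{\alpha/2}\,\frac{s}{\sqrt{n}}$$

where $\bar{x}$ is the sample mean, $s$ the sample standard deviation, $n$ the sample size, and $z_{\alpha/2} \approx 1.96$ at the 95 percent level. This form is consistent with the 2716 ± 141 billion dollar interval reported below.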
The implication of the confidence interval is that American commercial banks could lend out 2716 billion dollars in the following months of 2021, plus or minus 141 billion dollars. The CI in this regard is an interval estimate of the true mean, and there is a 5 percent chance that the upcoming loans will not fall within this range.
The expected loan amounts are given by multiplying the category column total by the result of dividing the category row total by the total sample size. Since the category row total and the total sample size are the same in this scenario, the expected loan amount is simply the category column total multiplied by one. As seen in Figure 4, the expected value for January 2020 (based on the 2019 results) was 2328.1491, whereas the observed value for that period was 2359.0658. Each cell’s Chi-square contribution is obtained by subtracting the expected value from the observed value, squaring the result, and dividing it by the expected value; the Chi-square statistic is the sum of these contributions across the months. The statistic suggests that some association exists between the 2019 and the 2020 monthly loan amounts extended by US commercial banks to individuals and organizations.
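The cell-by-cell computation just described is straightforward to reproduce. In the sketch below, only the January pair comes from the text; the other months would be filled in from the actual dataset:

```python
# Chi-square from observed and expected monthly loan amounts. Only January's
# pair (observed 2359.0658, expected 2328.1491) is quoted in the text; a full
# run would list all twelve months.
observed = [2359.0658]
expected = [2328.1491]

# Each cell contributes (observed - expected)^2 / expected; the statistic is
# the sum of these contributions.
chi_square = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
print(f"Chi-square statistic: {chi_square:.4f}")
```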
Lastly, the analysis of variance (ANOVA) is also an important inferential statistic that could be calculated from the Independent Community Bankers of America’s data about the amounts of loans that commercial banks have extended to Americans over the years. ANOVA is important in statistical analysis because it splits data into systematic and random factors (Rutherford, 2001; Blanca et al., 2018). Since it calculates the means of the treatment levels and examines their similarity to the overall mean, it provides crucial information about the statistical difference between them (Plonsky and Oswald, 2017; Breitsohl, 2019). Figures 5 and 6 below summarize the analysis of variance for the 2020 data about the loans offered by US commercial banks to individuals. Figure 5 shows the single-factor ANOVA, while Figure 6 shows the two-factor ANOVA without replication.
Since the F-value is significantly large in both the single-factor and two-factor ANOVA, the variability between group means in the loan data between January and December 2020 is larger than the variability within the groups. Usually, the p-value indicates the strength of the evidence for or against the null hypothesis; if it is low, strong evidence exists against the null hypothesis. However, this information was not captured in the ANOVA because the author did not determine the null and alternative hypotheses before examining the data.
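For reference, a single-factor ANOVA of the kind summarized in Figure 5 can be reproduced with scipy; the group values below are illustrative placeholders, not the actual 2020 figures:

```python
# One-way (single-factor) ANOVA across quarters of monthly loan data.
# The arrays are hypothetical stand-ins for the real 2020 values.
from scipy.stats import f_oneway

q1 = [2359.1, 2371.4, 2402.8]   # Jan-Mar, billions of dollars
q2 = [2553.2, 2601.7, 2588.9]   # Apr-Jun
q3 = [2570.3, 2544.1, 2521.6]   # Jul-Sep

f_stat, p_value = f_oneway(q1, q2, q3)
print(f"F = {f_stat:.2f}, p = {p_value:.4f}")
# A large F with a small p indicates that between-group variability
# exceeds within-group variability, as discussed above.
```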
Justification of the Inferential Statistical Model
The use of the confidence interval, Chi-square, and ANOVA is justified by their significance in the study. It is difficult or impossible to make a statistical comparison without the information these inferential statistics provide. In big data analysis, knowledge of all the important inferential statistics ensures that one makes the right decision based on the available data (Mishra et al., 2019). In smaller datasets such as the one provided in this case study, the confidence interval illustrates the spread of the values, making it easy for the user to approximate the range of the results and the likelihood that a given value will fall within the desired range; it also helps with predicting future values in forecast scenarios. The Chi-square, on the other hand, provides information about the relationship between categorical variables. It is one of the most important inferential statistics for testing or formulating a hypothesis (Kaur, Stoltzfus, and Yellapu, 2018), and it shows the relationship between the different statistical values included in a given dataset. Lastly, the analysis of variance (ANOVA) examines how the dataset variables, including the dependent and the independent ones, relate to each other (Faraway, 2002). This examination of the provided dataset shows that critical decisions are possible if the corporation understands the meaning of its big data and how it is changing. The Independent Community Bankers of America can use its data to determine better ways of extending loans and making profits from its huge data. Table 6 summarizes the information.
Table 6. Comparison of different inferential statistical models
Initial Outcomes of the Inferential Analysis
The initial outcomes of the inferential analysis show that the Independent Community Bankers of America can use the data in its proposal about the amounts of loans extended to American people and businesses by American commercial banks to organize its services and benefit from them. For example, the trends indicate that the amount of loans American people and businesses have taken over the years has increased significantly and will continue to do so in the coming years. Therefore, theoretically, companies offering financial services like loans are likely to continue enjoying good business well into the future (Asadoorian and Kantarelis, 2005; Big data: How could it improve decision making within your company? 2017). The only remaining uncertainty is that there is no confirmation of the most effective or reliable lender; that information could only be obtained from other datasets on the market.
The data used in this presentation indicates that Americans will continue to take loans, even though it does not specify the preferred amounts or the people most likely to do so. Because the data is highly representative, it is highly generalizable back to the population from which it is drawn (Allua and Thompson, 2009; Amrhein, Trafimow, and Greenland, 2019). This generalizability means the information applies to almost all Americans, though in slightly different ways. Therefore, it is the responsibility of the user of the data to determine how best to apply the information to get the maximum benefit from it. For example, although the data shows that Americans have been taking an increasing amount of loans yearly, it does not specify which Americans pose the least risk thanks to timely repayments. Thus, the company will need to examine individual loan applicants critically before approving any loans.
Further Visualization and Interpretation of Results as Expected in Business Reporting
Further visualization of the data is possible depending on the user’s needs. It can take the form of tables, diagrams, and graphs that provide the most accurate representation of the data. Visualization of statistical data is an important component of understanding big data and making sense of it: it simplifies and summarizes the information for easy consumption and decision-making. The data used in this example does not cover every possible scenario for applying big data to decision-making, but it provides important considerations for applying it in business (Data-driven decision-making: A primer for beginners, n.d). By utilizing big data, businesses today increase their chances of success and continue to grow and expand. Data-driven decisions are also more accurate and help companies to, among other things, offer specialized services and run appropriate campaigns for each group of customers.
As noted earlier, the confidence interval lets the business data user approximate the tendency or characteristics of future data based on current data, which makes it important for predicting future trends. As shown in Figure 7, a 95 percent confidence interval marks the area within which future data is likely to lie, with only a 5 percent chance that it will fall outside that area. For example, the data shows that American commercial banks are likely to loan out 2716 billion dollars in the next month of 2021 based on historical performance, plus or minus 141 billion dollars. This interval is centered on the estimated mean, with only a five percent chance that the true value lies outside it.
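A confidence band of this kind is easy to draw. The sketch below uses only the two numbers quoted in the text (the 2716 billion dollar point forecast and the 141 billion dollar margin) and assumes a twelve-month horizon:

```python
# Plot a flat point forecast with its 95% confidence band, as in Figure 7.
# The horizon and the flat shape of the forecast are assumptions.
import matplotlib.pyplot as plt
import numpy as np

months = np.arange(1, 13)
forecast = np.full(12, 2716.0)   # point forecast, billions of dollars
margin = 141.0                   # 95% margin of error from the text

plt.plot(months, forecast, label="Forecast mean")
plt.fill_between(months, forecast - margin, forecast + margin,
                 alpha=0.3, label="95% confidence interval")
plt.xlabel("Month of 2021")
plt.ylabel("Commercial loans (billions USD)")
plt.legend()
plt.show()
```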
As shown in Figure 8, the Chi-square test also offers information comparable to the confidence interval. It shows the similarities and differences between the expected and the observed values, thereby depicting a more accurate relationship between variables. The Chi-square test is particularly important in forming and testing hypotheses, and it therefore encourages users to analyze data visually and form hypotheses that could lead to critical business decisions. Notably, statisticians sometimes suspect the existence of a relationship between different variables; by creating null and alternative hypotheses and generating the Chi-square values, they can determine whether their suspicions are true or false.
Lastly, the analysis of variance offers a more critical examination of differences between variables, and it too is useful in testing or formulating hypotheses. The data included in this exercise provides sufficient detail on the use of ANOVA in decision-making. As shown in Figure 9, the statistic shows variations between and within groups, and the differences within and between these groups can provide critical insight into how future statistics based on the same information might look. Based on the data about commercial loans extended to Americans, it is evident that the amounts will continue to rise in the future, with less significant changes happening between and within groups. For instance, the number of monthly loans extended to Americans is expected to continue rising, translating into even more annual loans. Businesses that offer loan services can use this information to prepare adequately and position themselves to benefit from these changes. What matters is the ability to understand trends and move efficiently and effectively with them, rather than trying to change or stop them.
Recommendations for the Decision-Making Process
Table 7. Decision recommendation
Specific actions the company can take to provide customers with even better information about loans include investing in a reliable CPM. Acceptance by stakeholders can be achieved through extensive training and product popularization: although people tend to resist change, stakeholders can be shown that adapting to current industry practices is the easiest way of remaining competitive and achieving growth. CPM will help the company utilize even broader data more effectively, leading to success. An extra data source that could further augment the decision-making process is primary research, which entails asking customers questions directly to understand their opinions and expectations on various issues. Notably, the effectiveness of the applied analytical model depends on the quality of the data used and the analyst’s ability to make accurate and reliable predictions. Thus, the study assumes that the data presented concerning business loans that commercial banks have extended to customers is accurate. In the future, better outcomes can be achieved by confirming the effectiveness and reliability of the data; one can often do this by looking at the data sources and their past treatment, including storage and manipulation.
References List
Amrhein, V., Trafimow, D., and Greenland, S. (2019) ‘Inferential statistics as descriptive statistics: There is no replication crisis if we don’t expect replication’, The American Statistician, 73(1), 262-270.
Allua, S., and Thompson, C. B. (2009) ‘Inferential statistics’, Air Medical Journal, 28(4), 168-171.
America’s Community Bankers (2007) America’s Community Bankers: celebrating 115 years of leadership 1892-2007. Washington, DC: America’s Community Bankers.
Asadoorian, M. O., and Kantarelis, D. (2005) Essentials of inferential statistics. Lanham, Maryland: University Press of America.
Blanca, M. J. et al. (2018) ‘Effect of variance ratio on ANOVA robustness: might 1.5 be the limit?’, Behavior Research Methods, 50(3), 937-962.
Big data: How could it improve decision-making within your company? (2017) Web.
Breitsohl, H. (2019) ‘Beyond ANOVA: an introduction to structural equation models for experimental designs’, Organizational Research Methods, 22(3), 649-677.
Bonett, D. G. (2006) ‘Confidence interval for a coefficient of quartile variation’, Computational Statistics & Data Analysis, 50(11), 2953-2957.
Chen, S., Liu, J., and Xu, Y. (2021) ‘A logical reasoning based decision-making method for handling qualitative knowledge’, International Journal of Approximate Reasoning, 129, 49-63.
Cleff, T. (2021) Applied statistics and multivariate data analysis for business and economics. 1st edition. Cham: Springer.
Corporate performance management (n.d.) Web.
Corporate performance management: A tool for formulating organizational strategies (n.d.) Web.
Data-driven decision-making: A primer for beginners (n.d) Web.
Data-driven decision-making: Succeed in the digital era (n.d.) Web.
Data modeling: Conceptual, logical, physical, data model types (n.d.) Web.
Du Prel, J. B. et al. (2009) ‘Confidence interval or p-value? Part 4 of a series on evaluation of scientific publications’, Deutsches Ärzteblatt International, 106(19), 335-349.
Faraway, J. J. (2002) Practical regression and ANOVA using R. Bath: University of Bath Press.
Fisher, M. J., and Marshall, A. P. (2009) ‘Understanding descriptive statistics’, Australian Critical Care, 22(2), 93-97.
Franke, T. M., Ho, T., and Christie, C. A. (2012) ‘The chi-square test: often used and more often misinterpreted’, American Journal of Evaluation, 33(3), 448-458.
García, D., Nebot, A., and Vellido, A. (2017) ‘Intelligent data analysis approaches to churn as a business problem: a survey’, Knowledge and Information Systems, 51, 719-774.
Hazra, A. (2017) ‘Using the confidence interval confidently’, Journal of Thoracic Disease, 9(10), 4125-4136.
Integrated business intelligence and performance management: What you need to know (n.d.) Web.
Janssen, M., van der Voort, H., and Wahyudi, A. (2017) ‘Factors influencing big data decision-making quality’, Journal of Business Research, 70, 338-345.
Jin, D. and Kim, H. (2018) ‘Integrated understanding of big data, big data analysis, and business intelligence: a case study of logistics’, Sustainability, 10(10), 3778.
Kaur, P., Stoltzfus, J., and Yellapu, V. (2018) ‘Descriptive statistics’, International Journal of Academic Medicine, 4(1), 60-71.
Larson, M. G. (2006) ‘Descriptive statistics and graphical displays’, Circulation, 114(1), 76-81.
Marshall, G., and Jonker, L. (2010) ‘An introduction to descriptive statistics: a review and practical guide’, Radiography, 16(4), e1-e7.
McHugh, M. L. (2013) ‘The chi-square test of independence’, Biochemia Medica, 23(2), 143-149.
Mishra, P. et al. (2019) ‘Descriptive statistics and normality tests for statistical data’, Annals of Cardiac Anaesthesia, 22(1), 67-83.
Nakagawa, S., and Cuthill, I. C. (2007) ‘Effect size, confidence interval and statistical significance: a practical guide for biologists’, Biological Reviews, 82(4), 591-605.
Nick, T. G. (2007) ‘Descriptive statistics’, Topics in biostatistics, 10(2), 33-52.
Plonsky, L., and Oswald, F. L. (2017) ‘Multiple regression as a flexible alternative to ANOVA in L2 research’, Studies in Second Language Acquisition, 39(3), 579-592.
Pulakos, E. D. (2004) Performance management: a roadmap for developing, implementing and evaluating performance management systems. Alexandria, VA: SHRM Foundation.
Pullonen, P. et al. (2019) ‘Privacy-enhanced BPMN: enabling data privacy analysis in business processes models’, Software and Systems Modeling, 18, 3235-3264.
Rutherford, A. (2001) Introducing ANOVA and ANCOVA: a GLM approach. London: Sage.
Satorra, A., and Bentler, P. M. (2010) ‘Ensuring positiveness of the scaled difference chi-square test statistic’, Psychometrika, 75(2), 243-248.
Shamim, S. et al. (2019) ‘Role of big data management in enhancing big data decision-making capability and quality among Chinese firms: a dynamic capabilities view’, Information & Management, 56(6), 103135.
Sharpe, D. (2015) ‘Chi-square test is statistically significant: now what?’, Practical Assessment, Research, and Evaluation, 20(1), 8-12.
Shi, W. et al. (2021) ‘Handling uncertainty in financial decision making: a clustering estimation of distribution algorithm with simplified simulation’, IEEE Transactions on Emerging Topics in Computational Intelligence, 5(1), 42-56.
Sill, A. (2016) ‘Cloud, data, and business process standards for manufacturing’, IEEE Cloud Computing, 3(4), 74-80.
Tabachnick, B. G., and Fidell, L. S. (2007) Experimental designs using ANOVA. Belmont, CA: Thomson Brooks Cole Publishers.