Non-Linear Multivariate Analysis with Artificial Neural Network in Estimating Compression Index for Cohesive Soils of Northern Jakarta Coast

This study presents a novel application of an artificial neural network (ANN) to develop a model for predicting the compression index (Cc) of cohesive soils from their index properties. The model was trained using data from 347 undisturbed samples of a variety of cohesive soils from Northern Jakarta. It takes up to three variables as inputs: specific gravity (Gs), liquid limit (LL), and plastic limit (PL). The model was tested on a separate dataset of 117 samples and found to have a strong capability to predict Cc values when compared to some reference correlations. The ANN model demonstrated good performance, producing an overall error of 29.6% compared to 38.1% and 30.5% for the empirical formulas. This study shows that the application of ANN offers an essential advancement in this area, helping to overcome the limitations of conventional statistical correlation.


INTRODUCTION
Soils are naturally compressible and undergo volume changes in response to applied stresses (Balasubramaniam & Brenner, 1981). The resulting settlement consists of three phases: immediate settlement, primary consolidation settlement, and secondary consolidation settlement (Das & Sobhan, 2018). Highly plastic soils are predominant along North Jakarta Bay, and they often experience large consolidation settlement.
Assessing soil consolidation properties through testing often requires a lot of time, and geotechnical drilling operations yield only a limited number of undisturbed samples (UDS). Empirical equations offer a way to reduce the cost, but their application to different sites can be questionable (Al-Taie et al., 2017). This paper explores a modern alternative, artificial neural networks (ANNs), whose use in geotechnical engineering has increased significantly and with success (Shahin et al. 2001). ANNs excel at modeling complex relationships, making them well suited to conditions in which the connections between variables are elusive (Hubick, 1992).
This study aims to create a model to predict the compression index (Cc) using soil properties gathered from 347 undisturbed soil samples in Northern Jakarta. The model's performance is subsequently assessed using an additional 117 undisturbed soil samples, allowing for a comparative evaluation against other empirical equations.

COMPRESSION INDEX
Apart from checking the integrity of the structure, the calculation of soil settlement under structural load is the most important aspect. Settlement is defined as the decrease in soil volume due to water flowing out of the soil and particle rearrangement under applied pressure. The compression index (Cc) is one of the parameters used to calculate settlement. A high Cc value means the soil is more compressible (Dwivedi et al., 2016a).
The oedometer test is time-consuming and requires precision, precautions, and expertise. It is therefore difficult to obtain an ideal value of Cc, and even a very small disturbance during the test can lead to overestimation or underestimation of Cc. Correlations have therefore been developed to avoid the time cost of obtaining oedometer test results. Two popular correlations are compared in this study:

$C_c = 0.009\,(LL - 10)$ (Terzaghi & Peck, 1967) (1)

$C_c = 0.2343 \times \dfrac{LL}{100} \times G_s$ (Nagaraj & Murty, 1985) (2)

Although empirical correlations can provide a quick and inexpensive way to estimate soil parameters from simple tests, most of these correlations are derived by fitting measurements made under specific site conditions. This may cause large deviations when they are used at other sites (Dehghanian & Ipek, 2022).

ARTIFICIAL NEURAL NETWORK MODEL

Artificial Neural Networks (ANNs)
ANNs are artificial adaptive systems inspired by the functioning of the human brain and nervous system (Grossi & Buscema, 2008). ANNs provide strong solutions to problems in several areas, including classification, prediction, filtering, optimization, pattern recognition, and function approximation.
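For reference, the two correlations (1) and (2) compared in this study can be evaluated with a short sketch. It assumes the commonly cited forms Cc = 0.009 (LL - 10) for Terzaghi & Peck (1967) and Cc = 0.2343 (LL/100) Gs for Nagaraj & Murty (1985), with LL in percent. The function names and the sample values are hypothetical and are not taken from the study's dataset.

```python
def cc_terzaghi_peck(liquid_limit_pct: float) -> float:
    """Equation (1), assumed form: Cc = 0.009 (LL - 10), LL in percent."""
    return 0.009 * (liquid_limit_pct - 10.0)


def cc_nagaraj_murty(liquid_limit_pct: float, specific_gravity: float) -> float:
    """Equation (2), assumed form: Cc = 0.2343 (LL / 100) Gs, LL in percent."""
    return 0.2343 * (liquid_limit_pct / 100.0) * specific_gravity


# Hypothetical sample values, chosen only for illustration.
ll, gs = 80.0, 2.65
cc_tp = cc_terzaghi_peck(ll)
cc_nm = cc_nagaraj_murty(ll, gs)
print(f"Terzaghi & Peck: {cc_tp:.3f}, Nagaraj & Murty: {cc_nm:.3f}")
```

Both functions need only index properties, which is why such correlations are attractive when oedometer results are scarce.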
The biological nervous system is extremely complicated; artificial neural network algorithms seek to simplify this complexity and focus on what may theoretically matter most from an information-processing standpoint (Thakur & Konde, 2021). A comprehensive description of ANNs is beyond the scope of this paper. Many authors have described the structure and operation of ANNs (e.g., Hecht-Nielsen 1990; Maren et al. 1990; Zurada 1992; Fausett 1994; Ripley 1996).
The foundation of artificial neural networks (ANNs) lies in their constituent elements, the artificial neurons or processing elements (PEs). These PEs operate on a simple mathematical model defined by three basic operations: multiplication, summation, and activation. Within the artificial neuron, the input values are weighted, that is, each input is multiplied by an associated weight. At the core of the artificial neuron is a summation function that aggregates all the weighted inputs along with a bias term. Finally, the sum of the weighted inputs and bias passes through an activation function, often referred to as a transfer function (Andrej et al., 2011), at the output of the artificial neuron. Figure 1 illustrates the operational principle of an artificial neuron.
While the principles governing this operation may appear deceptively simple, the true computational power of these models emerges when the neurons are interconnected within an artificial neural network, as depicted in Figure 1 (right). This interconnectedness gives ANNs the ability to modify their own connections over time, thereby initiating a learning process that characterizes the entire ANN (Hebb, 1949). This process of connection modification is often referred to as the 'Law of Learning.' Notably, the dynamism of an ANN is essentially linked to time. To modify its connections, the ANN requires continuous interaction with its environment, typically represented by data, over extended periods (Rosenblatt, 1958). This learning process is a key mechanism that defines ANNs as adaptive processing systems.
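The three operations of a processing element (multiplication, summation, activation) can be sketched in a few lines. This is a minimal illustration only: the function name `neuron`, the sample inputs and weights, and the choice of `math.tanh` as the activation are assumptions made for the example, not details taken from the study.

```python
import math


def neuron(inputs, weights, bias, activation):
    """Single artificial neuron: weight the inputs, sum them with a bias, then activate."""
    weighted = [x * w for x, w in zip(inputs, weights)]  # multiplication
    net_input = sum(weighted) + bias                     # summation with bias
    return activation(net_input)                         # activation (transfer function)


# Hypothetical inputs and weights, chosen only for illustration.
output = neuron([2.65, 0.80, 0.30], [0.5, -1.0, 2.0], 0.1, math.tanh)
print(output)
```

Passing the activation as an argument keeps the neuron independent of any particular transfer function.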
Neurons can be organized in any topological manner (e.g., one- or two-dimensional layers, three-dimensional blocks, or higher-dimensional structures), depending on the quality and amount of input data. The most common ANNs are arranged in a so-called forward topology (Wasserman, 1989; Aleksandar & Morton, 1990), and this type of ANN is adopted in this study. A certain number of PEs are combined into an input layer, normally depending on the number of input variables. The information is forwarded to one or more hidden layers within the ANN. The output layer, as the last element of this structure, provides the result. The output layer contains only one PE, regardless of whether the result is a binary value or a single number. Figure 2 shows the most popular neural network architecture, forward propagation (Fahlman, 1988; Le Cun, 1989).
Figure 2 Forward Propagation neural network architecture.

Activation Function
Activation functions, also known as transfer functions, are instrumental in artificial neural networks. They convert input signals into outputs, which then serve as inputs for subsequent layers. Net inputs, central to the network's structure, are transformed into unit activations through these functions, constituting a scalar-to-scalar conversion (Sharma et al., 2020).
The Leaky Rectified Linear Unit (Leaky ReLU) is employed as the activation function in this study. Leaky ReLU is defined as:

$$f(y_i) = \begin{cases} y_i, & y_i > 0 \\ a_i\, y_i, & y_i \le 0 \end{cases}$$

where $a_i$ is the coefficient of the i-th channel for negative inputs, a small constant, typically around 0.01. Because its output is not 0 for negative inputs, it improves on the ReLU function by addressing the "dying ReLU" problem encountered when ReLU is employed. Its derivative is straightforward to obtain. Figure 3 presents the function curve and its derivative curve. Because it outputs a constant multiple of the input for negative inputs, Leaky ReLU, unlike ReLU, does not saturate in either direction (Feng & Lu, 2019).
The term "back propagation" is shorthand for "backward propagation of errors," and it is a standard procedure for training artificial neural networks (Soemartono et al. 2018). The back propagation process involves several key steps:
a) Error Rate Computation: This initial step calculates the discrepancy between the model's output and the actual target output.
b) Error Minimization: The process then checks whether the error has been minimized effectively.
c) Weight and Bias Updates: If the error exceeds an acceptable threshold, the weights and biases are updated accordingly. This cycle continues until the error converges to a satisfactory level.
d) Neural Network Model Finalization: Once the error rate falls within an acceptable range, the neural network model is ready for deployment and can be used to forecast data.
Figure 4 presents the workflow of the back propagation mechanism (Sekhar & Meghana, 2020).
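The Leaky ReLU function, its derivative, and steps a) to d) of back propagation can be illustrated together on a deliberately tiny case: one neuron with one input. This is a sketch under stated assumptions, not the study's VBA implementation. The mean squared error, the plain gradient descent update, the learning rate, the tolerance, and the synthetic data are all choices made for the example.

```python
def leaky_relu(y, a=0.01):
    """Leaky ReLU: y for positive inputs, a * y otherwise."""
    return y if y > 0 else a * y


def leaky_relu_grad(y, a=0.01):
    """Derivative of Leaky ReLU: 1 for positive inputs, a otherwise."""
    return 1.0 if y > 0 else a


def train_single_neuron(samples, lr=0.1, tol=1e-6, max_epochs=5000):
    """Fit one Leaky ReLU neuron to (x, target) pairs following steps a) to d)."""
    w, b = 0.5, 0.0  # arbitrary starting weight and bias
    error = float("inf")
    for _ in range(max_epochs):
        # a) Error rate computation: mean squared error and its gradients.
        error = grad_w = grad_b = 0.0
        for x, target in samples:
            z = w * x + b
            diff = leaky_relu(z) - target
            error += diff ** 2 / len(samples)
            # Chain rule: propagate the output error back through the activation.
            g = 2.0 * diff * leaky_relu_grad(z) / len(samples)
            grad_w += g * x
            grad_b += g
        # b) Error minimization: stop once the error is acceptably small.
        if error < tol:
            break
        # c) Weight and bias updates.
        w -= lr * grad_w
        b -= lr * grad_b
    # d) Finalization: return the trained parameters and the last error.
    return w, b, error


# Synthetic targets from t = 2x + 1, used only to exercise the loop.
samples = [(x, 2.0 * x + 1.0) for x in (0.0, 0.5, 1.0, 1.5, 2.0)]
w, b, err = train_single_neuron(samples)
print(w, b, err)
```

In a multi-layer network the same chain-rule step is repeated layer by layer, from the output back to the input.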
The gradient descent (GD) algorithm plays a pivotal role in the back propagation process. This iterative optimization technique, widely employed in ANNs, gradually adjusts the parameters θ with the objective of minimizing the error function J(θ) (Mustapha et al. 2020). Notably, the GD algorithm uses the entire dataset for each parameter update, an approach that is accurate but demands a substantial computational workload.
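As a compact reminder, the standard gradient descent update can be written as follows. The learning rate $\eta$, the iteration index $t$, the sample count $N$, and the per-sample loss $\ell$ are symbols introduced here for illustration; they do not appear in the text above.

```latex
\theta_{t+1} = \theta_t - \eta \, \nabla_{\theta} J(\theta_t),
\qquad
J(\theta) = \frac{1}{N} \sum_{i=1}^{N} \ell\big(\hat{y}_i(\theta),\, y_i\big)
```

The sum over all $N$ samples in $J(\theta)$ is what makes each update use the entire dataset.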
In our study, we use the Adaptive Moment Estimation (Adam) algorithm to navigate the optimization landscape efficiently. Adam combines two prominent optimization methods, Momentum and RMSProp (Kingma & Ba, 2015). It calculates adaptive learning rates tailored to each parameter. Optimization continues until the objective function reaches its minimum, which corresponds to the smallest deviation between the predicted values and the test set values.
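The Adam update can be sketched in plain Python as below. The hyperparameter defaults (beta1 = 0.9, beta2 = 0.999, eps = 1e-8) follow the values suggested by Kingma & Ba (2015). The function name `adam_minimize`, the learning rate, the step count, and the quadratic test function are assumptions for illustration; the settings used in the study are not stated in the text.

```python
import math


def adam_minimize(grad_fn, theta, lr=0.001, beta1=0.9, beta2=0.999,
                  eps=1e-8, steps=1000):
    """Minimize a function with Adam, given a function returning its gradient."""
    theta = list(theta)
    m = [0.0] * len(theta)  # first moment estimate (Momentum-like term)
    v = [0.0] * len(theta)  # second moment estimate (RMSProp-like term)
    for t in range(1, steps + 1):
        grad = grad_fn(theta)
        for i, g in enumerate(grad):
            m[i] = beta1 * m[i] + (1.0 - beta1) * g
            v[i] = beta2 * v[i] + (1.0 - beta2) * g * g
            m_hat = m[i] / (1.0 - beta1 ** t)  # bias correction
            v_hat = v[i] / (1.0 - beta2 ** t)
            # Per-parameter adaptive step size.
            theta[i] -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return theta


# Hypothetical objective J = (t0 - 3)^2 + (t1 + 1)^2 with minimum at (3, -1).
def quadratic_grad(theta):
    return [2.0 * (theta[0] - 3.0), 2.0 * (theta[1] + 1.0)]


solution = adam_minimize(quadratic_grad, [0.0, 0.0], lr=0.1, steps=500)
print(solution)
```

Dividing by the square root of the second moment is what gives each parameter its own effective learning rate.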

ANN Model Design
The dataset used for the training phase in this study comprised 347 undisturbed cohesive soil samples collected from multiple projects in the northern bay area of Jakarta. This dataset incorporated three essential index properties as variables: specific gravity (Gs), liquid limit (LL), and plastic limit (PL). The objective is to employ these variables as inputs to predict the compression index (Cc) using Artificial Neural Networks (ANNs). The ANN model was developed using Visual Basic for Applications (VBA) in Microsoft Excel. The VBA code implemented the algorithms described in Chapter 2, using 8 hidden layers with 7 neurons in each hidden layer and the Adam optimizer. To assess the performance of the model, a separate test set of 117 samples with properties similar to those of the dataset was employed. Figure 5 provides a snapshot of the compression index (Cc) from the dataset and test set in this study.
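To make the stated architecture concrete, the sketch below builds a network with 3 inputs (Gs, LL, PL), 8 hidden layers of 7 neurons each, and 1 output (Cc), and runs a forward pass. It is written in Python rather than the study's VBA and is not the authors' code. The random initialization, the linear output neuron, the absence of input scaling, and the sample input are assumptions; the weights are untrained, so the printed value has no physical meaning.

```python
import random

# 3 inputs (Gs, LL, PL), 8 hidden layers of 7 neurons, 1 output (Cc).
LAYER_SIZES = [3] + [7] * 8 + [1]


def leaky_relu(z, slope=0.01):
    return z if z > 0 else slope * z


def init_network(layer_sizes, seed=0):
    """Create (weights, biases) per layer with small random weights (assumed scheme)."""
    rng = random.Random(seed)
    network = []
    for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]):
        weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
                   for _ in range(n_out)]
        biases = [0.0] * n_out
        network.append((weights, biases))
    return network


def forward(network, inputs):
    """Propagate inputs through every layer; hidden layers use Leaky ReLU."""
    activations = list(inputs)
    for index, (weights, biases) in enumerate(network):
        net = [sum(w * a for w, a in zip(row, activations)) + b
               for row, b in zip(weights, biases)]
        is_output = index == len(network) - 1
        # Assumption: the single output neuron is linear.
        activations = net if is_output else [leaky_relu(z) for z in net]
    return activations[0]


network = init_network(LAYER_SIZES)
n_params = sum(len(w) * len(w[0]) + len(b) for w, b in network)
# Hypothetical sample: Gs = 2.65, LL = 0.80, PL = 0.30 (as fractions).
prediction = forward(network, [2.65, 0.80, 0.30])
print(n_params, prediction)
```

Under these assumptions the network has 428 trainable weights and biases, which are the parameters the optimizer adjusts during training.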

Training Phase
The training phase aimed to check the suitability of the ANN model for predicting Cc values. This was measured by comparing the overall error (δ), calculated using formula (5), of the ANN model with that of the reference equations. Figure 6 presents the Cc calculated with those equations compared with Cc from laboratory tests (data set), whereas Figure 7 shows the same comparison for the ANN model.
Notably, the bold black line represents calculated or predicted Cc values that exactly match the laboratory test (y = x), while the dotted line represents the trend line of the results, passing through the origin (0,0). It should be noted that the features of a trend line, its gradient and R-squared (R²), are not suitable indicators. These values are measured relative to the trend line itself and not to the y = x line; hence, they do not best reflect the performance of a model. This is evident in Figure 6: although the trend line for Nagaraj's equation has a lower gradient and R² than Terzaghi's, it has a lower overall error and is more suitable in this case.
Therefore, the overall error (δ) of a model, which considers the deviation of every single data point, was adopted as the indicator of a model's suitability. The lower the value, the better the accuracy of the model.
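The argument that trend line statistics can mislead is easy to reproduce with synthetic numbers. In the sketch below, a hypothetical model predicts exactly half of every measured value: its trend line through the origin fits perfectly (R² = 1), yet every point deviates from the y = x line by 50%. The mean absolute percentage deviation used here is only a stand-in for a point-by-point error measure; it is an assumption and is not claimed to be formula (5). The data are invented for illustration.

```python
def origin_trendline(measured, predicted):
    """Least-squares trend line through (0, 0) and R² relative to that line."""
    slope = (sum(m * p for m, p in zip(measured, predicted))
             / sum(m * m for m in measured))
    mean_p = sum(predicted) / len(predicted)
    ss_res = sum((p - slope * m) ** 2 for m, p in zip(measured, predicted))
    ss_tot = sum((p - mean_p) ** 2 for p in predicted)
    return slope, 1.0 - ss_res / ss_tot


def mean_abs_pct_deviation(measured, predicted):
    """Average deviation from the y = x line, in percent of the measured value."""
    total = sum(abs(p - m) / m for m, p in zip(measured, predicted))
    return 100.0 * total / len(measured)


# Synthetic data: predictions are systematically half of the measurements.
measured = [0.2, 0.4, 0.6, 0.8, 1.0]
predicted = [0.5 * m for m in measured]

slope, r_squared = origin_trendline(measured, predicted)
deviation = mean_abs_pct_deviation(measured, predicted)
print(slope, r_squared, deviation)
```

A high R² here says only that the predictions are self-consistent, not that they agree with the laboratory values.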
The ANN model resulted in δ of 38.9%. This value was lower than those of Terzaghi's and Nagaraj's equations, 55.9% and 43.2%, respectively. This implies that the ANN model was more accurate than the equations. In addition, for higher Cc values (larger than 1.5), the ANN model predicts better than the empirical formulas. These findings provide a solid basis for concluding that the ANN model is suitable for predicting Cc values.
Figure 7 Predicted Cc from ANN model based on data set.

Testing Phase
After the training phase, the trained ANN model was employed to predict Cc values. These values were then compared with Cc from the test set (117 data points). The same method was applied to the models based on the equations. The results are shown in Figure 8.

CONCLUSION AND RECOMMENDATION
The ANN model proved to be a promising alternative for predicting the compression index (Cc). It provides good predictions compared with existing empirical formulas. In addition, the ANN model was developed for the typical soil conditions of a particular area, in this case North Jakarta Bay, whereas the empirical formulas are not tailored to North Jakarta's conditions.
Further studies could improve this model's reliability by exploring other activation functions or other gradient descent algorithms.

Figure 1. Working principles of an artificial neuron (left); Example of simple ANN network (right).

Figure 4 Flow diagram of back propagation mechanism.

Figure 6 Calculated Cc from equations based on data set.
Figure 8 displays the distribution of results for the ANN model and the equations. The reference equations resulted in δ of 38.1% for Terzaghi's equation and 30.5% for Nagaraj's, while the trained ANN model produced δ of 29.6%, smaller than both. This showed that the ANN model's performance was consistent with the training phase, and the model is therefore applicable for determining Cc.

Figure 8 Calculated Cc compared to testing set.