Determining the Principal Azimuths of Valles Marineris: A Comparative Analysis of Bezier Spline and Geographic Information System Techniques

Dan James

doi:10 .4172/2327-4581.1000411

Research Article, Geoinfor Geostat An Overview Vol: 12 Issue: 6

Determining the Principal Azimuths of Valles Marineris: A Comparative Analysis of Bezier Spline and Geographic Information System Techniques

Dan James^*

¹Department of Philosophy of Science, University of Exeter, Exeter, England

*Corresponding Author: Dan James,
Department of Philosophy of Science, University of Exeter, Exeter, England
E-mail: danjamesdeveloper@gmail.com

Received date: 16 October, 2024, Manuscript No. GIGS-24-150239;

Editor Assigned date: 18 October, 2024, PreQC No. GIGS-24-150239 (PQ);

Reviewed date: 01 November, 2024, QC No. GIGS-24-150239;

Revised date: 08 November, 2024, Manuscript No. GIGS-24-150239 (R);

Published date: 15 November, 2024, DOI: 10.4172/2327-4581.1000419

Citation: James D (2024) Determining the Principal Azimuths of Valles Marineris: A Comparative Analysis of Bezier Spline and Geographic Information System Techniques. Geoinfor Geostat An Overview.12:6.

Abstract

Valles Marineris (VM), the most prominent example of a Martian Valley Network, has been extensively studied for over 50 years, yet no detailed examination of the principal azimuths of the system exists. To address this, two methods are presented to precisely calculate the principal azimuths, a Bezier Spline analysis and a Geographic Information System (GIS) technique. The medial axis of the main canyon of VM was determined analytically from cubic polynomial splines fitted to 93 coordinate points along both north and south edges of the canyon. These splines were optimized and medial axis points were calculated through numerical techniques that ensured orthogonality between the tangents of each spline and their connecting normal lines. 1,000 medial axis points were extracted and various regression models constructed, including fitting to sinusoidal and cubic polynomial curves, achieving accuracies with R2 values of 0.98 and 0.99, respectively. Principal azimuths were obtained using the sinusoidal equation with the slope of the tangent at any point x simply determined by the derivative of the curve’s equation. This analytic approach was cross-validated by a GIS method (using QGIS software), where a vector medial axis was obtained which produced principal azimuths that agreed with values from the analytic study with a correlation coefficient of 1.00 and a p value of 6.43e-65. The findings demonstrate that an azimuthal framework can be rigorously constructed as a potential standard reference in VM geoscience, replacing less precise and ambiguous compass bearings with the accurate azimuths necessary for high-resolution spatial analysis for future investigations.

Download PDF

Introduction

There is conjecture regarding the influence of water in shaping Valles Marineris (VM), the most prominent example of a martian valley network and interpreted as good evidence of prolonged surface water on Mars [1-3]. Although the dominant hypothesis attributes the formation of VM to tectonics, megashears, subsidence, trough collapse, or a combination [4-7].

Previous orientation research has focused on determining azimuths of localised individual features within the VM system, for example dike placement in eastern Coprates Chasma, however, despite extensive investigations over 50 years, no survey has to date established the principal azimuths of the VM system, which are essential for a precise understanding the directional forces that have shaped the region [4].

Medial axis bezier spline analysis

This study aims to fill this gap by using a medial axis bezier spline analysis, a method commonly employed in geomorphological modeling for delineating central trajectories and cross-validating this approach with a more conventional, algorithmic GIS technique to determine the principal azimuths of the main canyon [8]. These two analysis methods are rigorously compared to provide statistical evidence confirming the accuracy and reliability of the derived medial axis and azimuthal orientations.

Main canyon definition and Coordinate Reference Systems (CRS)

The VM canyon system has distinct morphological complexity with multiple subsidiary canyons, which are ancillary to the main canyon running parallel or subparallel to it, contributing significantly to the overall topology of the system. In this study, the term ‘main canyon’ is defined as the most pronounced geomorphological feature, trending west to east from Noctis Labyrinth in the west from 15,900 km longitude to Eos Chaos in the east at 18,700 km (Figure 1).

Figure 1: Shaded area represents the main canyon of the VM. The North and South Canyon edges are labelled and marked by white lines. Melas Chasma is at the midpoint of the main canyon and the approximate starting point of the main canyon is at 15,900 km in the west, terminating at approximately 18,700 km in the east.

While the main canyon’s medial axis, as discussed in this paper, is not the midline of the collective features for the entire Valles Marineris system, establishing the axial symmetry of the main canyon necessarily preserves congruent symmetry in any parallel or subparallel subsidiary canyon.

Reference images were obtained from the Thermal Emission Imaging System (THEMIS) daytime IR mosaic (100 m/pixel) and Java Mission-planning and Analysis for Remote Sensing (JMARS), with an analytical GIS primarily via ArcMapTM. Equidistant Cylindrical Projection was used to preserve accurate distances along meridians via a Coordinate Reference Systems (CRS) base: ESRI:103885 (Mars Equidistant 2000 Sphere). Accurate ellipsoid parameters were added with custom code: semimajor axis: 3396.2, semiminor axis: 3376.2.

Defining edge splines

Various methods exist to determine the medial axis of an object, such as Voronoi skeletons, distance transform methods and thinning algorithms such as wave propagation via grassfire algorithms. In this study, a Bezier curve approximation technique was used where univariate spline curves were fitted to the North and South Canyon edges of VMs via the UnivariateSpline function from Python’s SciPy library [9-12]. A total of 93 coordinate points were identified on both North and South Canyon edges, with all points evenly spaced and with longitudinal distances preserved within the Cartesian kilometer-based CRS (ESRI:103885).

The edge splines of the main canyon were constructed with parametric cubic spline continuous functions, ensuring first and second derivative continuity and avoiding sharp changes in the curve’s slope or curvature, which are common in lower-degree splines. Denoting the splines for the north and south edges as N(t) and S(t), where t is the parameter along the spline or arc length, the spline functions for each canyon edge in a 2D plane can be written as:

Where:

• B_i,3 (t) and B_j,3 (t) are the cubic B-spline basis functions and are defined over a knot vector that is locally supported.

• n and m are the numbers of control points for the North and South edge splines, respectively, determined through the spline fitting process.

• The interval [a,b] represents the domain of the parameter t.

• P_1i and P_2jare the control points for the north and south edge splines and define the initial shape and direction of the spline, which is then adjusted via the optimization process. While control points do not generally lie on the curve, they act as weights in the combination of basic functions that define the curve.

Spline optimization

Univariate spline interpolation involves a piecewise-defined cubic polynomial function that fits a set of data points in a single variable, with knots segmenting the dataset and individual polynomial functions fitted within each segment. The placement of knots and the overall smoothness of the curve in the Univariate Spline algorithm are regulated by a key parameter: The smoothing factor (s).

When s is unspecified, Univariate Spline defaults to place knots at every data point, effectively interpolating the data. However, specifying a value for s balances the spline’s smoothness with its adherence to the data by a least-squares optimization process, where the spline coefficients are calculated to minimize the sum of squared residuals between the fitted spline and the data points, subject to the smoothness constraint imposed by s. Given a set of data points (x_i, y_i), where i=1, 2…, N and a spline function S(x), the goal is to find the spline that minimizes the following objective function:

Where:

• The sum of the squared residuals, representing the fit of the spline S(x) to the data points (x_i, y_i), (lower values indicate a better fit) is added to the integral of the square of the second derivative of the spline, which acts as a measure of the spline’s smoothness. A smoother curve will have a lower value of this integral.

• λ represents the smoothing factor s in Univariate Spline and is a parameter that balances the two objectives Smoothness of the spline and proximity to the data points.

This optimization problem is solved in Univariate Spline via numerical methods, as curve fitting problems typically lack closed-form solutions. With the user-defined s value constraint, the algorithm iteratively adjusts the knots positions and the coefficients of the spline to minimize J, capturing the essential trends of the coordinate data while reduce overfitting.

Canyon edge spline analysis

The Sum of the Squared Residuals (SSR) conceptually distinct, has an important relationship with the smoothing factor s in the Univariate Spline algorithm, which explains the high SSR value shown (Figure 2).

Figure 2: Spline fitted to the North Canyon and South Canyon edge showing 93 coordinate positions and Knot positions. Note: (a) The smoothing factor used was 5,000, resulting in an SSR of 5,000; (b) The smoothing factor used was 5,000, resulting in an SSR of 5,000. The South Canyon edge with its higher variability has more generated knots and control points than the North Canyon edge.

The smoothing factor serves as an upper bound for the SSR, guiding the algorithm to adjust the spline’s smoothness such that the SSR approaches but does not exceed s. Consequently, the final SSR of a spline often equals the value set for s. Canyon edges, which span >3,000 km, exhibit significant variability due to landslide extensions and subsidiary canyon intrusion, leading to potentially large residuals, cumulatively resulting in a higher SSR.

A higher smoothing factor is therefore required for the canyon edge data, enabling the spline to adequately curve to the coordinate data’s underlying trends and allowing for this higher SSR without being excessively constrained by variability. SSR is a measure of the total deviation of the response values from the fit to the response values and while a lower SSR may indicate a better fit, it is not a guide to how good the fit is relative to the spread and variability of the data, whereas the R² value, the coefficient of determination, indicates how well the independent variable explains the variability of the dependent variable by calculating the ratio of SSR to SST (total sum of squares)–R²=1−SSR/SST¹.

Both edge splines, which are based on a smoothing factor of 5,000 (SSR 5,000), returned R² values close to 1, implying that a significant proportion of the total variation in the data is captured and explained by the spline models with minimal deviation from the actual data points. The Root Mean Square Error (RMSE) for the charts shown was 7.33, which, in the context of the canyon edge data measured in kilometers, represents 7.33 km a mean error equivalent of <0.24%, which is low when contrasted against the vast overall length of the canyon edges (>3,000 km). To assess how different values of the smoothing factor affect the fit, numbers of knots and control points, several iterations were carried out for each canyon edge (Tables 1 and 2).

North Canyon Edge–Coordinate Points=93
SSR/Smoothing factor	Knots	Control Points	R²	RMSE
10	84	89	1	0.328
100	71	78	1	1.036
1000	40	49	0.9996	3.277
5000	15	21	0.9978	7.332
10,000	8	14	0.9956	10.369

Table 1: Different values of the smoothing factor affect the fit, numbers of knots and control points, several iterations for North Canyon Edge.

South Canyon Edge –Coordinate Points = 93
SSR/smoothing factor	Knots	Control Points	R²	RMSE
10	85	91	1	0.3279
100	72	79	1	1.037
1000	47	55	0.9997	3.277
5000	31	37	0.9986	7.332
10,000	15	21	0.9972	10.368
1 In UnivariateSpline documentation, the SSR is only required to be ≤ s, so for some splines, it can theoretically be less. For all splines detailed in this paper it was equal.

Table 2: Different values of the smoothing factor affect the fit, numbers of knots and control points, several iterations for South Canyon Edge.

Medial axis extraction

The medial axis spline is a path equidistant from two edge splines at any point along its length. If the medial axis spline is denoted by M(t), a point M(t_i)=(x_m(t), y_m(t)) on this spline should satisfy the following condition where the distance from any point on the medial axis spline to the nearest point on each edge spline is equal:

Finding midpoints between two splines by simply averaging the distance on a vertical bisector so that Midy=(North_y+ South_y)/2 in the context of VM is not an optimal method for finding the medial points. The VM system trends ≈ 20° from northwest to southeast, curving in multiple ways therefore line segments joining the canyon edge splines requires orthogonality to fully capture the curvature and direction of the splines.

The goal is to find an equal number of points along each of the canyon edge splines where lines joining pairs of points are perpendicular (normal) to both splines at their respective points of tangency and then find the midpoints M(t_i) of these lines. This multidimensional optimization problem has a 2D parameter space that is required to optimize two variables (t_north) and (t_south), which are parameters along North and South splines at intersection points with a line segment between both splines. The optimization sequence can be written as:

Smoothing: Apply s=5000 to both North and South splines.

Point pair identification: Generate a sequence of point pairs N(t_i), S(t_i) for i=1, 2…, n along the entire length of the North and South splines.

Slope threshold: Calculate the slope at each point on the southern spline as the first derivative of the spline at each point, denoted as Slope S(t_i). Define a threshold value for the slope as slope threshold.

Conditional processing: For each point pair {N (t_i), S(t_i)}, check the slope at S(t_i). If slope S(t_i)<slope threshold, proceed with subsequent steps (5-10). Else skip to the next point pair {N(t_i), S(t_i)} and proceed with subsequent steps (5 to 10).

Vector: Define L as the vector connecting a point on the North spline at (t_north) to a corresponding point on the South spline at (t_south) expressed as L=S(t_south)−N(t_north).

Tangent vectors: Calculate the tangent vectors at t_north and t_south as TN (t_north) and TS (t_south), respectively.

The perpendicularity function f (t_north, t_south) is: f (t_north, t_south)=[L * TN(t_north)]² + [ L * TS(t_south)]²……(5)

This function quantifies the squared sum of the dot products of the vector L with the vectors at TN (t_north) and TS (t_south).

Multidimensional optimization: Adjust tnorth and tsouth to minimize f(t_north, t_south) to zero and therefore the perpendicularity of L with both TN (t_north) and TS (t_south).

Midpoint calculation: Use the perpendicular bisector to calculate the midpoint M(t_i) of L, which represents the optimized medial axis point for each ti.

Iteration: Repeat the process for 1,000 points to compile the set M(t_i).

Normal line equivalence at points (t_north) and (t_south) requires the tangents of the curve at each point to be parallel. For the original dataset of the Canyon edge coordinate points, this condition is potentially difficult to meet, as there may be significant direction changes in the Canyon edges.

However, the smoothing factor used in the first step of the above procedure helps to regularize the data by reducing noise and minor fluctuations and facilitates the multidimensional optimization process by making the behavior of the spline (especially its tangents) more consistent and less prone to abrupt changes. Too much variability in the coordinate data might otherwise mislead an optimization algorithm to either prevent convergence or alternatively produce a local solution that is not globally optimal. For consistency in this study, a smoothing factor of 5,000 was used for both the North and South splines.

A specific conditionality function is also integrated into the above sequence (steps 3,4) to address the anomaly at the midpoint of the South Spline, where the region of Melas Chasma imposes pronounced southward semicircular extrusions along the canyon edge and spline representation. The steeper tangent slopes in this region potentially lead to complexities in the algorithm’s ability to resolve accurately. To avoid this, the algorithm employs a thresholdbased decision process where if the calculated slope at any given point exceeds a predetermined angle–set for this study at ± 25°-the algorithm omits processing for that particular point pair, advancing instead to the subsequent pair. Given the localized nature of the Melas Chasma steep tangent slopes, this adjustment impacts only a negligible portion of the overall spline representation.

The determination of each coordinate point M(t_i) in the saved array M(t_i) can be described more formally by optimizing the objective function J₁, which enforces equidistance and orthogonality constraints for points on the medial axis spline:

where:

• d₁(t_i) and d₂(t_i) are the distances from M(t_i) to N(t_i) and S(t_i) respectively.

• O₁(t_i) and O₂(t_i) are measures of the deviation from orthogonality at M(t_i) to N(t_i) and S(t_i) respectively. This is computed as the sum of the squares of the dot product of the vector L with the tangent vectors T_N and T_S.

• λ₁ and λ₂ are weighting parameters for the importance of equidistance and orthogonality in the objective function.

• t_i represents the parameter values at which the spline and constraints are evaluated.

• N is the number of points along the spline where calculations are made.

The array of medial axis coordinate points M(t_i) was used to construct a medial axis spline in a 2D plane via the same method as the Canyon edge splines were created via the UnivariateSpline algorithm, formally expressed as:

where:

• M(t) represents the medial axis spline.

• B_k,3(t) are the cubic B-spline basis functions for the medial axis spline.

• P_mk are the control points of the medial axis spline.

• p is the number of control points for the medial axis spline, which is determined through the spline fitting process.

• The interval [c,d] represents the domain of the parameter t for the medial axis spline.

For the medial axis spline optimization, the objective function JM balances the dual goals of fitting the medial axis spline closely to the determined medial axis points. While also maintaining a smooth curve controlled by the choice of the smoothing parameter:

where:

• (x_i,y_i) are the coordinate points that the medial axis spline is intended to fit.

• q is the number of data points.

• The sum of the squared residuals, representing the fit of the medial axis spline to the data points (x_i,y_i) is added to ∫ (M"(x))²dx, the integral of the square of the second derivative of the spline and acts as a measure of the spline’s smoothness.

• λ is the smoothing parameter that balances the trade-off between the fit to the data points and the smoothness of the spline (represented by the smoothing factor s in Univariate spline) (Figures 3-5).

Figure 3: Medial axis spline fitted to 1,000 median axis points. The smoothing factor of 5,000, resulted in 12 knots and SSR of 5,000.

Figure 4: Positions relative to a Thermal Emission Imaging System (THEMIS) daytime IR mosaic composite image of Valles Marineris of the fitted splines (North, South & Medial). The shaded area represents the extent of the VM Main Canyon.

Figure 5: Chart showing the medial axis spline based on 1,000 medial axis coordinates, without showing canyon edge splines, for evaluation of potential medial axis symmetries.

Regression analysis

Various curve fitting models were considered for fitting to the median axis coordinate data of 1000 points. A visual inspection of the medial axis plot suggested a possible sinusoidal fit to the data and an initial guess for this fit, its determination and comparison to final fit values (Table 3).

Parameter	Initial guess	Methodology for initial guess	Final fit values	%Change
Amplitude (A)	-236	A ≈ (max(y)–min (y))/2	-239.01	-1.27%
Angular frequency (ω)	-0.0018	ω ≈ 2π*(Cycles/Range of x)	-0.00128	28.90%
Phase shift (φ)	261.02	Align first peak with model	251.1	-3.80%
Vertical shift (B)	-631.57	B ≈ mean(y)	-624.05	1.20%

Table 3: Methodology for initial guess (sinusoidal).

Model sinusoidal: To optimize the sinusoidal model parameters, the iterative least-squares Levenberg–Marquardt method was used, implemented via SciPy’s ‘optimize’ and ‘curve fit’ functions. This method minimizes the residual sum of squares, a common objective function (loss function) in regression analysis. The initial parameter estimates resulted in reliable model convergence, with final values closely matching the initial guess. For comparative analysis, linear, geodesic path and cubic polynomial models were included. The fitted sinusoidal curve was determined as:

Model cubic polynomial: Cubic polynomial models are widely used in situations where a balance is needed between model complexity and the ability to accurately represent the underlying trends of the data without the risk of overfitting associated with higher degree models. The loss function for n data points can be defined as:

This loss function was minimized by a least squares algorithm implemented in Python and the fitted cubic polynomial was determined for 1000 coordinate points as:

Model linear: VM is typically described as exhibiting linear characteristics or even, more casually, as being a linear feature. To critically evaluate this assertion, a linear model was included as a reference or baseline in this analysis. This approach allows for a direct comparison with other models to demonstrate that a more complex model structure than a purely linear one, better represents the underlying data patterns.

Model geodesic: A geodesic model was chosen to evaluate any significant divergence from the geodesic trajectory by the medial axis, which would indicate influences or additional factors beyond simple distance minimization. Substantial deviations from the geodesic require a causal explanation, whether stochastic, deterministic or a combination (e.g., tectonic split in the crust, water/lava erosion, or other causal factors).

The ‘Geod’ object from the ‘pyproj’ library, which is optimized for handling latitude and longitude data for distance and angle calculations on a spherical model (Mars), was used to define the geodesic path. Longitude values were normalized to the 0-360° range in a standard positive east longitudinal system and paths were optimized to prevent wrapping around the sphere. The start and end coordinates were set at -6.9° latitude, 270° longitude and -13.8° latitude, 316° longitude, respectively.

Based on Mars’s mean radius of 3396.2 kms, the ‘Geod’ object generated 100 intermediate coordinate points along the calculated great circle route. For quantitative analysis and plotting in linear units, a conversion from latitude and longitude to kilometers was conducted via a ‘numpy’ function applying the Equidistant Cylindrical projection (ESRI:103885). The latitude and longitude were adjusted relative to a central point (defaulted to 0°) and then converted to radians multiplied by the mean radius of Mars to compute the Cartesian coordinates, where �� represents the horizontal distance and �� represents the vertical distance from the central point in a planar projection of Mars. The geodesic path, converted from original latitude/longitude coordinates to Cartesian coordinates, was modeled via a quadratic polynomial to accommodate the expected slight curvature. The best-fit polynomial, determined through least squares fitting, is given by the following equation (Table 4 and Figure 6).

Model	Log-likelihood	AIC	BIC	SD of residuals	R²	Adjusted R²	RMSE
Cubic polynomial model	-425.196	858.392	868.812	17.082	0.99	0.99	16.996
Sinusoidal model	-429.416	866.832	877.253	17.818	0.989	0.989	17.729
Linear regression model	-519.145	1042.29	1047.5	43.707	0.935	0.935	43.488
Geodesic path	-552.37	1108.74	1113.95	35.24	0.876	0.873	60.319

Table 4: Comparison of regression metrics.

Figure 6: Chart showing four Models fitted to medial axis coordinates (100).

The plot shown is not at a sufficient scale to distinguish separate sinusoidal and cubic polynomial fitted curves-there is considerable overlap. However, a clearer way of discriminating between each model can be seen, where a dataset of 100 medial axis points was examined according to a variety of standard metrics. The cubic polynomial model’s ability to adapt to more complex patterns, although numerically showing the best performance, does not provide a significant advantage, suggesting that the complexity it adds may not be necessary to capture the primary dynamics of the data, thereby tending to validate the simpler sinusoidal model.

Cubic polynomial models have more coefficients for each degree of the polynomial, allowing a fit that captures more fluctuations in the data than a sinusoidal model, which is characterized by a single frequency, amplitude and phase. The Akaike Information Criterion (AIC) and the Bayesian Information Criterion (BIC) are standard techniques for comparing models with different complexities, with the AIC penalizing complexity by adding twice the number of parameters (2000) and the BIC further multiplying the number of parameters by the logarithm of the data points (log (n)*k). However, in both cases, while complexity is penalized, a good fit is always rewarded and if the polynomial model’s complexity leads to a better fit, this can suggest improved performance despite a greater number of parameters.

Kernel density estimate for residuals

A kernel density estimation plot of all four models also reveals a close match for both sinusoidal and cubic polynomial models, with a unimodal distribution of sharp high central peaks for both indicating a high concentration of residuals near zero and therefore a good fit with minimal error. The other two comparative models, linear and geodesic path, display pronounced bimodal distributions, both have two distinct residual groups where the models are under fitting or overfitting and cannot capture the curvature or pattern in the data.

Data scaling

To evaluate model robustness and efficiency across a spectrum of data volumes, an AIC analysis was used based on a range from 25 coordinate points to 1,000 points, which were extracted from the original medial spline calculation sinusoidal curve. The linear trend shows that as the dataset size increases, the complexity of the models scales up linearly, indicating no sudden increase in complexity, which might otherwise imply overfitting or poor scalability (Figure 7).

Figure 7: KDE of the residuals and data scaling for AIC. Note: (a) KDE for the four models; (b) Data scaling for 25–1,000 coordinates compared with the AIC value.

The cubic polynomial model consistently maintains a marginal advantage over the sinusoidal model, yet both overlap substantially at the scale of the chart. Both significantly outperform the linear model used as a reference.

Azimuths of the main canyon

Determining the principal azimuths of the main canyon is important for understanding the geomorphology of Valles Marineris. By fitting a sinusoidal curve to the medial axis coordinate data, the azimuths can be efficiently calculated using the derivative of the curve, with the slope of the tangent at any point �� simply determined by the derivative of the curve’s equation. Using this method, potentially unlimited datasets of azimuths can be generated for the curve; however, only 20 azimuths were generated for display clarity on a chart. A total of 1,000 medial axis coordinates were curve fitted to a sinusoidal function to derive the best fit equation.

To find the slope y′(x) at any point, we differentiate y(x) with respect to x, using standard differentiation techniques (product rule and chain rule), the derivative of the sinusoidal function, which represents the slope at any point x, is found to be.

Assuming an initial position (x₀, y₀), the calculated slope of the tangent is converted to an angle via the arctangent function, which yields an angle θ in radians from the positive x-axis.

To convert this Cartesian angle into an azimuth (north-based bearing), the coordinate system is rotated by 90°-θ₀ and to ensure that all azimuths are positive and appropriately normalized, the azimuth is adjusted via the modulo operation: Azimuth= (azimuth+360) mod 360. For the iteration of azimuths, the change in azimuth over distance △x uses the derivative.

with the next azimuth value predicted by:

where:

• x: Independent variable representing the horizontal coordinate.

• Y(x): The sinusoidal function describing the vertical coordinate as a function of x.

• y′/(x): The derivative of y(x) represents the slope of the function at any point x.

• θ: The azimuth angle in degrees, measured clockwise from north.

• θ₀: The initial azimuth angle at the starting position.

• Δθ: The change in the azimuth angle over a distance Δx.

• Δx: A change in the horizontal coordinate x.

Implications of the azimuth calculations

By applying the method described in section 2.8, detailed azimuth data along the entire length of the main canyon can be generated. This high-resolution azimuthal information is important for:

Analyzing structural trends: Understanding how the orientation of the canyon changes provides insights into tectonic stresses and faulting patterns.

Comparing with geological features: Correlating azimuth variations with specific geological formations may reveal relationships between structural orientation and depositional or erosional processes.

Advantages of the method-efficiency: The analytical approach allows for rapid computation of azimuths without the need for extensive GIS processing.

Precision: Calculations are based on a mathematical model fitted to the data, reducing potential errors from manual measurements.

Scalability: The method can be applied to any number of points along the canyon, facilitating studies at different resolutions (Figure 8).

Figure 8: Fitted sinusoidal curve and azimuths. Note: (a) Extraction of twenty azimuth values in a Cartesian CRS; (b) Extraction of 20 azimuth values in a Geodetic CRS. Medial axis spline smoothed with an SSR of 5,000.

GIS validation

In addition to the above detailed analytic techniques, a computational GIS approach was also undertaken via QGIS software. The aim was to cross-validate the analytic results, particularly the azimuth calculations, with azimuths derived algorithmically and in an alternative CRS geodetic coordinate system. Several tools and plugins were used to automatically detect edges and vectorize. A brief description of the workflow in the QGIS is as follows:

Obtain high-resolution DEM data (JMARS) and georeferenced with CRS ESRI:107971 (Mars 2000 Sphere).

Geotrace North and South Canyon edges of the main canyon with an edge detection algorithm and appropriate threshold values (Canny Edge Detection)

Buffer each raster canyon edge, merge raster layers to vectors, extract vertices and then process with the Voronoi polygon tool.

From the vectorized Voronoi skeleton extract the medial axis line

Segment medial axis vector via a field calculator and derive azimuths for each segment.

GIS/Analytic comparison

Two sets of azimuths, one from the analytic approach (Cartesian) and one from the GIS methodology (Geodetic) were compared to assess the accuracy and correlation of both methods. 50 azimuths were extracted for each set (Table 5).

Mean summary
Metric	Geodetic azimuths	Cartesian azimuths	Comparison
Mean (degrees)	100.25°	99.07°	Geodetic mean slightly higher
Standard deviation (degrees)	7.95°	7.16°	Similar variability
Correlation metrics
Metric	Value	Interpretation
Correlation coefficient (r)	1	Perfect linear relationship
p value (correlation)	6.45E-65	Highly significant correlation
Paired t test (t-statistic)	9.59	Significant difference in means
Paired t test (p value)	7.94E-13	Highly significant difference

Table 5: Mean summary and correlation metrics for 50 azimuths which were extracted for each set.

The azimuths derived independently from the geodetic and Cartesian CRS exhibit a high degree of similarity; however, a slight difference in the mean azimuths was observed (Geodetic: 100.25°, Cartesian: 99.07°). This variation can be attributed to several factors inherent to the different methods employed. Aside from potential CRS projection transformation discrepancies (which can introduce minor errors due to the approximations used in the conversion processes), the principal way discrepancies are likely to have occurred is with the collection of data for each set.

The geodetic azimuths were calculated algorithmically via the GIS program QGIS, whereas the Cartesian azimuths were determined analytically via Bezier splines. Slight discrepancies in the edge detection of canyons and spline derivation from selected coordinate points on canyon edges affect the final azimuth calculations. However, the significant correlation between the two sets of azimuths (correlation coefficient=1.00, p value=6.43e-65), despite the methodological differences in obtaining them, indicates strong agreement between the two approaches.

Although the geometric and Cartesian azimuths differ slightly in terms of their exact values, they still follow the same general sinusoidal trend, which is reflected in the high correlation coefficient and demonstrates that the two datasets have a very strong linear relationship, i.e., effective cross-validation. The small difference in means, while statistically significant (paired t test: T statistic=9.59, p value=7.94e-13), indicates that even minor differences can be detected with high precision due to the paired nature of the data and the large sample size.

Results and Discussion

This study derived the medial axis for the main canyon of the VM via both an analytic approach with Bezier splines and an algorithmic approach with GIS software (QGIS). The results were compared statistically to assess the correlation between the two methods by comparing the azimuths produced by both methods. The statistical analysis revealed a correlation coefficient for both sets of azimuths of 1.00 and a p value of 6.43e-65. This strong correlation validates the accuracy and reliability of both approaches in mapping the medial axis of the VM.

Determining the principal azimuths for Valles Marineris introduces the possibility of standardized directional references across future VM geoscience studies. Defining orientations with precise azimuthal degrees eliminates the ambiguity and imprecision of compass bearings, aligning with contemporary geospatial good practice and paving the way for a universally applicable VM azimuth standard. Using azimuths helps minimize errors and misinterpretations caused by the vagueness of compass bearings and is essential for detailed spatial analysis, with the clarity offered by azimuths ensuring that data is analyzed and understood consistently. Adopting a precise azimuthal framework could significantly improve the reproducibility of high resolution VM research findings and further more robust cross-disciplinary collaborations.

In the regression analysis, four models were compared: Sinusoidal, cubic polynomial, linear and geodesic. The sinusoidal and cubic polynomial models provided the best fit, demonstrating significant alignment with the derived medial axis at distances greater than 3,500 km. The fit of the sinusoidal and cubic polynomial models over such a large distance is unexpected and suggests that underlying mechanisms influence the formation of the main canyon and its medial axis in a predictable manner.

The challenge lies in understanding which geological processes, such as planetary tectonics, might create patterns that align so accurately with these mathematical models. The data from the azimuthal calculations potentially support the role of large-scale tectonic forces in shaping the landscape, paralleling studies of Earth’s own tectonic systems or a possible megashear at a dichotomy boundary [7,13]. An accurate fitted curve model, particularly one that fits well over a large distance, implies a deterministic causation process, as such models typically represent regular, periodic and predictable phenomena rather than stochastic, unpredictable phenomena. A potential focus for further research is to examine the principal azimuths of the main canyon for their applicability to all features of the VM system by quantifying the exact degree of congruent symmetry or parallelism in all subsidiary canyons or features.

Limitations

The selection of coordinate data on the Canyon edges to construct Bezier splines has inherent subjectivity; however, the choice of evenly spaced x-axis coordinates (longitudes) was precisely aimed at minimizing any subjective bias.

Selection effects are also inevitably involved in the choice of where precisely the main canyon of VM originates and ends. VM trends west to east and the first identifiable main canyon edges can be seen in the west, as they emerge from the heavily fractured area of the Noctis Labyrinth at approximately 15,900 km on the x-axis in a Cartesian CRS or 265° east of a positive east reporting system in a Geodetic CRS.

The termination of the main canyon edges is less clear in the east, where large outflow channels from VM have created a poorly defined chaotic terrain at Eos Chasma and Capri Chasma, leading to Aurorae Chaos. This study takes the last well-defined main canyon edge to be at approximately 18,700 km or 320° east, after which the canyon edges become highly irregular. This easternmost point can be seen as the boundary between the main canyon and the beginning of the outflow networks that eventually lead to the northern expanse of Chryse Planitia.

Conclusion

This research presents two methodologically robust approaches to calculating the medial axis of the main canyon of Valles Marineris. A Bezier approximation spline technique coupled with multidimensional optimization was compared with an algorithmic GIS approach. Both methods are highly reproducible and yield accurate values for the medial axis coordinates. In addition, both methods can determine the principal azimuths of the main canyon of the VM and statistically show a high degree of correlation, therefore effectively cross validating each approach. Introducing the potential for standardized directional references across future VM geoscience studies. That a sinusoidal model is an accurate fit to the medial axis is an unexpected result and suggests that deterministic processes were primarily involved in the formation of Valles Marineris.