Preliminaries.
Let $M$ be an $n$-dimensional smooth manifold, let $f_1,\dots, f_n:M\to\Bbb{R}$ be smooth functions and fix $p\in M$. Then, the following conditions are equivalent:
- $\{(df_1)_p, \dots, (df_n)_p\}$ forms a basis for $T_p^*M$
- $\{(df_1)_p, \dots, (df_n)_p\}$ is linearly independent in $T_p^*M$
- $\{(df_1)_p, \dots, (df_n)_p\}$ spans $T_p^*M$
- $(df_1)_p\wedge\cdots\wedge (df_n)_p\neq 0$
- there is an open neighbourhood $U$ of $p$ in $M$ such that $(U, (f_1,\dots, f_n))$ is a local coordinate chart for $M$.
The first three are equivalent by the rank-nullity theorem (or just by counting dimensions: $n$ vectors in the $n$-dimensional space $T_p^*M$ are linearly independent if and only if they span); this is very easy linear algebra. The equivalence with the fourth is also a basic fact about wedge products. The equivalence with the last statement follows from the inverse/implicit function theorem.
Next, recall the following general fact: if $f:M\to\Bbb{R}$ is a smooth function and $(U,\alpha=(x^1,\dots, x^n))$ is a coordinate chart, then on $U$, we have \begin{align} df&=\left(\frac{\partial f}{\partial x^i}\right)_{\alpha}\,dx^i, \end{align} where we have used the summation convention.
The standard warning about notation.
For completeness, let me mention that we define \begin{align} \left(\frac{\partial f}{\partial x^i}\right)_{\alpha}\equiv \frac{\partial f}{\partial x_{\alpha}^i}:=\left[\partial_i(f\circ \alpha^{-1})\right]\circ\alpha. \end{align} This is a smooth function $U\to\Bbb{R}$. The $\equiv$ means “same thing, different notation”. The reason for stressing the $\alpha$ is that this depends not only on the single function $x^i\equiv x_{\alpha}^i:=\text{pr}^i\circ\alpha$, but on the entire coordinate chart $\alpha$ (as you can see from the RHS of the definition). But we are a lazy bunch, so we simply write things like \begin{align} df&=\frac{\partial f}{\partial x^i}\,dx^i. \end{align} So, although it seems like the symbol $\frac{\partial f}{\partial x^i}$ depends only on $f$, $x^i$, and some notion of $\partial$, it actually depends on the entire coordinate chart $\alpha$. In many cases this slight abuse of notation will not cause confusion (e.g. if we fix a single coordinate chart $(U,\alpha= (x^1,\dots, x^n))$, or even if we have two coordinate charts $(U,\alpha)$ and $(U,\beta=(y^1,\dots, y^n))$ but none of the functions $x^i,y^i$ are the same). But if, for example, it happens that $x^1=y^1$ as functions, it still need not be true that $\left(\frac{\partial f}{\partial x^1}\right)_{\alpha}=\left(\frac{\partial f}{\partial y^1}\right)_{\beta}$.
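To see this chart-dependence concretely, here is a small sympy sketch (the function $f$ and the two charts are made up purely for illustration): two charts on $\Bbb{R}^2$ share the first coordinate function, yet the two versions of “$\frac{\partial f}{\partial x}$” disagree.

```python
import sympy as sp

x, y = sp.symbols('x y')
f = x**2 * y  # a hypothetical smooth function on R^2

# Chart alpha = (x, y): the usual partial derivative.
df_dx_alpha = sp.diff(f, x)  # (∂f/∂x)_alpha = 2*x*y

# Chart beta = (x, u) with u = x + y; beta shares the function x with alpha.
u = sp.Symbol('u')
f_beta = f.subs(y, u - x)                       # f ∘ beta^{-1}, since y = u - x
df_dx_beta = sp.diff(f_beta, x).subs(u, x + y)  # [∂_1(f ∘ beta^{-1})] ∘ beta

# Same coordinate function x, different charts, different partials:
assert sp.simplify(df_dx_alpha - df_dx_beta) != 0  # 2*x*y  vs  2*x*y - x**2
```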
Setup and Hypotheses.
Let $\Sigma$ be an arbitrary $2$-dimensional smooth manifold and let $x,y,z:\Sigma\to\Bbb{R}$ be three smooth functions. Note, I’m using the notation $x,y,z$, but at this stage forget about $\Bbb{R}^3$; there’s no relation to “Cartesian coordinates” or anything of the like. If I wanted to, I could just write $h_1,h_2,h_3:\Sigma\to\Bbb{R}$ as the names of these three functions.
Fix a point $p\in\Sigma$ and suppose $dx_p\wedge dy_p\neq 0$, $dx_p\wedge dz_p\neq 0$ and $dy_p\wedge dz_p\neq 0$. Then by the inverse/implicit function theorem, there is an open neighbourhood $U$ of $p$ in $\Sigma$ such that $(U, (x,y))$, $(U,(x,z))$ and $(U, (y,z))$ are all coordinate charts for $\Sigma$. (Simply apply the result from before three times and take the intersection of the three neighbourhoods).
We’re now going to do the same computation three different ways.
Step 1(a): $z$ and $(U, (x,y))$
Ok, so we need to start differentiating. So, we need a function $f$ and we need a coordinate chart for $\Sigma$.
We shall take $f=z:\Sigma\to\Bbb{R}$ as the function, and we shall take $(U, (x,y))$ as the coordinate chart. Then, keeping in mind the above remarks about ambiguities in the partial derivative notation, let us write things explicitly as \begin{align} dz&= \left(\frac{\partial z}{\partial x}\right)_{(x,y)}\,dx+\left(\frac{\partial z}{\partial y}\right)_{(x,y)}\,dy,\tag{$*$} \end{align} where the subscript is to remind us that this is computed relative to the coordinate chart $(U, (x,y))$.
Step 1(b): $y$ and $(U, (x,z))$
Now we take the function $f=y:\Sigma\to\Bbb{R}$ and consider the coordinate chart $(U,(x,z))$. Then, we get \begin{align} dy&= \left(\frac{\partial y}{\partial x}\right)_{(x,z)}\,dx+\left(\frac{\partial y}{\partial z}\right)_{(x,z)}\,dz.\tag{$**$} \end{align}
Step 1(c): combining steps (a) and (b).
Now, we plug the formula for $dy$ from $(**)$ into the RHS of $(*)$: \begin{align} dz&= \left(\frac{\partial z}{\partial x}\right)_{(x,y)}\,dx+\left(\frac{\partial z}{\partial y}\right)_{(x,y)}\cdot\left[ \left(\frac{\partial y}{\partial x}\right)_{(x,z)}\,dx+\left(\frac{\partial y}{\partial z}\right)_{(x,z)}\,dz\right]\\ &=\left[\left(\frac{\partial z}{\partial x}\right)_{(x,y)}+ \left(\frac{\partial z}{\partial y}\right)_{(x,y)} \left(\frac{\partial y}{\partial x}\right)_{(x,z)}\right]\,dx + \left(\frac{\partial z}{\partial y}\right)_{(x,y)} \left(\frac{\partial y}{\partial z}\right)_{(x,z)}\,dz. \end{align} Finally, since $\{dx,dz\}$ are linearly independent (on $U$), we can simply equate coefficients to obtain: \begin{align} \begin{cases} \left(\frac{\partial z}{\partial x}\right)_{(x,y)}+ \left(\frac{\partial z}{\partial y}\right)_{(x,y)} \left(\frac{\partial y}{\partial x}\right)_{(x,z)}&=0\\ \left(\frac{\partial z}{\partial y}\right)_{(x,y)} \left(\frac{\partial y}{\partial z}\right)_{(x,z)}&=1. \end{cases} \tag{!} \end{align}
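If you want to sanity-check $(!)$ on a concrete example, here is a quick sympy computation with a made-up surface $z = x^2 + y^3$, restricted to $x, y > 0$ so that the relevant pairs really do form charts:

```python
import sympy as sp

x, y, z = sp.symbols('x y z', positive=True)

# Hypothetical concrete surface: z = x**2 + y**3, with x, y > 0.
z_of_xy = x**2 + y**3                     # z in the (x, y) chart
y_of_xz = (z - x**2)**sp.Rational(1, 3)   # y solved for in the (x, z) chart

dz_dx = sp.diff(z_of_xy, x)               # (∂z/∂x)_{(x,y)}
dz_dy = sp.diff(z_of_xy, y)               # (∂z/∂y)_{(x,y)}
# Express the (x, z)-chart partials back in (x, y) coordinates via z = z(x, y):
dy_dx = sp.diff(y_of_xz, x).subs(z, z_of_xy)   # (∂y/∂x)_{(x,z)}
dy_dz = sp.diff(y_of_xz, z).subs(z, z_of_xy)   # (∂y/∂z)_{(x,z)}

assert sp.simplify(dz_dx + dz_dy * dy_dx) == 0  # first line of (!)
assert sp.simplify(dz_dy * dy_dz) == 1          # second line of (!)
```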
Steps 2, 3, and concluding.
Now, we simply repeat the above process two more times. You will arrive at analogous sets of equations (!!) and (!!!). In particular, one of them will say that \begin{align} \left(\frac{\partial z}{\partial x}\right)_{(x,y)} \left(\frac{\partial x}{\partial z}\right)_{(y,z)}&=1, \end{align} which implies that \begin{align} \left(\frac{\partial z}{\partial x}\right)_{(x,y)}&=\frac{1}{\left(\frac{\partial x}{\partial z}\right)_{(y,z)}}. \end{align} Finally, plugging this into the first line of (!), and rearranging yields the desired result \begin{align} \left(\frac{\partial x}{\partial z}\right)_{(y,z)} \left(\frac{\partial z}{\partial y}\right)_{(x,y)} \left(\frac{\partial y}{\partial x}\right)_{(x,z)}&=-1,\tag{$\ddot{\smile}$} \end{align} or more succinctly just written as \begin{align} \frac{\partial x}{\partial z} \frac{\partial z}{\partial y} \frac{\partial y}{\partial x}=-1. \end{align}
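Here is a quick sympy sanity check of $(\ddot{\smile})$ on a made-up concrete surface, $z = x^2 + y^3$ with $x, y > 0$ (so each pair of functions gives a chart where needed):

```python
import sympy as sp

x, y, z = sp.symbols('x y z', positive=True)

# Hypothetical concrete surface: z = x**2 + y**3, with x, y > 0.
z_of_xy = x**2 + y**3                     # z in the (x, y) chart
y_of_xz = (z - x**2)**sp.Rational(1, 3)   # y in the (x, z) chart
x_of_yz = sp.sqrt(z - y**3)               # x in the (y, z) chart

dz_dy = sp.diff(z_of_xy, y)                    # (∂z/∂y)_{(x,y)}
dy_dx = sp.diff(y_of_xz, x).subs(z, z_of_xy)   # (∂y/∂x)_{(x,z)}, in (x, y) coords
dx_dz = sp.diff(x_of_yz, z).subs(z, z_of_xy)   # (∂x/∂z)_{(y,z)}, in (x, y) coords

# The triple product identity:
assert sp.simplify(dx_dz * dz_dy * dy_dx) == -1
```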
Further Remarks.
Once again, I’ll reiterate that $x,y,z:\Sigma\to\Bbb{R}$ are mostly arbitrary smooth functions (with some rank conditions on pairs of their derivatives). In this situation, we can play lots of games regarding who is the function $f$ vs who is the coordinate chart $(U, (x^1,\dots, x^n))$. Exploiting this to the fullest extent gives us several relationships between the various partials, the above triple product identity being one of them.
Another thing I’ll mention (as I’ve said here and elsewhere) is that a function $f:M\to\Bbb{R}$ is not “a function of” anything. It is only when we fix a coordinate chart $(U,\alpha)$ that we get an induced function $f\circ\alpha^{-1}:\alpha[U]\to\Bbb{R}$. If we now decide to give further names $x^1,\dots, x^n$ to the component functions of $\alpha$ that we say things like “$f$ is a function of $x^1,\dots, x^n$” and write things like $f=f(x^1,\dots, x^n)$. But really this is all just unnecessary vocabulary (though often convenient), and will probably distract you from the underlying concept of functions vs coordinate charts and their interactions (especially in this case).
Finally, I’ll mention that I could have rewritten most of this answer to avoid the differential geometry language, at least if $\Sigma$ is embedded in $\Bbb{R}^3$, so it’s not like you have to know what differential forms are (but one must know the inverse and implicit function theorems). You’ll just have to give tons of different names for various functions (at the local coordinate level) differing by composition of a diffeomorphism (see my answer to this thermodynamics question on PhySE for a brief indication of how this would go).
Edit: A higher dimensional case, different proof, and a purely real-analysis phrasing.
Let $A\subset\Bbb{R}^n$ be open, with $n\geq 2$, $f:A\to\Bbb{R}$ a smooth function, and assume that $\partial_1f,\dots, \partial_nf$ are all nowhere-vanishing. This sounds like a strong assumption, but we’re going to be making statements about derivatives, so really if you make this assumption at one point, then by continuity of the partials this holds in a neighbourhood; we can then work in this neighbourhood.
With this assumption, and several rounds of applying the implicit function theorem, we can shrink the open set $A$ further such that for each $i\in\{1,\dots, n\}$, there is a smooth function $\phi_i:W_i\subset\Bbb{R}^{n-1}\to\Bbb{R}$ such that up to permutation of the variables, the graph of $\phi_i$ equals $\Sigma:=f^{-1}(\{0\})$. In other words, $x=(x_1,\dots, x_n)\in \Sigma$ if and only if $f(x)=0$, if and only if $x_i=\phi_i(x_1,\dots, x_{i-1}, x_{i+1},\dots, x_n)$ (i.e. we have “solved” for $x_i$ as a function $\phi_i$ of the remaining variables).
In particular, for all $(x_1,\dots, x_{i-1}, x_{i+1},\dots, x_n)\in W_i$, we have \begin{align} f(x_1,\dots, x_{i-1}, \phi_i(x_1,\dots, \widehat{x_i},\dots, x_n), x_{i+1},\dots, x_n)=0. \end{align} Fix any $j\neq i$. Then, by taking $\partial_j$ of both sides, we have \begin{align} (\partial_jf) + (\partial_if)\cdot \frac{\partial\phi_i}{\partial x_j}&=0. \end{align} Here, I’m sorry, but I had to put in the notation $\frac{\partial\phi_i}{\partial x_j}$, because I want to differentiate $\phi_i$ with respect to the slot in which $x_j$ appears. Since I wrote the arguments as $\phi_i(x_1,\dots, x_{i-1},x_{i+1},\dots, x_n)$, we have that if $j<i$ then this really is $\partial_j\phi_i$. Unfortunately if $j>i$, then $x_j$ actually appears in slot $j-1$ of $\phi_i$ so it should technically be $\partial_{j-1}\phi_i$.
Of course, the partials of $f$ and $\phi_i$ must be evaluated at the correct points. So, rearranging, we get \begin{align} \frac{\partial\phi_i}{\partial x_j}&=-\frac{\partial_jf}{\partial_if}. \end{align} This is of course the usual implicit differentiation formula. Now, take $j=i+1$ and take the product over all $i\in\{1,\dots, n\}$ (with the understanding that indices are considered modulo $n$, so for example, index $n+3$ should be considered the same as index $3$): \begin{align} \prod_{i=1}^n\frac{\partial\phi_i}{\partial x_{i+1}}&=\prod_{i=1}^n-\frac{\partial_{i+1}f}{\partial_if}=(-1)^n, \end{align} since the partials on the right cancel out term by term (again, everything needs to be evaluated at the correct respective points). By the usual abuse of notation of not writing out the entire coordinate chart, this would be expressed as \begin{align} \frac{\partial x_1}{\partial x_2}\cdot\frac{\partial x_2}{\partial x_3}\cdots \frac{\partial x_{n-1}}{\partial x_n}\frac{\partial x_n}{\partial x_1}&=(-1)^n. \end{align}
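The telescoping cancellation on the right is easy to check in sympy; here is a sketch with one hypothetical choice of $f$ for $n=4$ (any $f$ with nowhere-vanishing partials would do):

```python
import sympy as sp

n = 4
xs = sp.symbols('x1:5', positive=True)
# Hypothetical level-set function; its partials 1, 2*x2, 3*x3**2, 4*x4**3
# are nonvanishing for positive coordinates.
f = xs[0] + xs[1]**2 + xs[2]**3 + xs[3]**4 - 1

# Multiply the implicit-differentiation factors -(∂_{i+1} f)/(∂_i f), indices mod n.
prod = sp.Integer(1)
for i in range(n):
    prod *= -sp.diff(f, xs[(i + 1) % n]) / sp.diff(f, xs[i])

assert sp.simplify(prod) == (-1)**n  # the telescoping product equals (-1)^n
```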
The manifold generalization: an outline
Let $\Sigma$ be an $(n-1)$-dimensional smooth manifold, with $n\geq 2$, and let $x^1,\dots, x^n:\Sigma\to\Bbb{R}$ be smooth functions such that for each $i\in\{1,\dots, n\}$, the collection $\alpha_i= (x^1,\dots, x^{i-1}, x^{i+1},\dots, x^n)$ forms a (global) coordinate chart for $\Sigma$. Again, we can make a linear independence assumption at one point, apply the inverse/implicit function theorem, then shrink neighbourhoods sufficiently, so this isn’t really a strong assumption.
Then, we have \begin{align} \left(\frac{\partial x^1}{\partial x^{2}}\right)_{\alpha_1} \left(\frac{\partial x^2}{\partial x^{3}}\right)_{\alpha_2}\cdots \left(\frac{\partial x^{n-1}}{\partial x^{n}}\right)_{\alpha_{n-1}} \left(\frac{\partial x^n}{\partial x^{1}}\right)_{\alpha_n}&= (-1)^n. \end{align} You can prove this using either of the two approaches I suggested before.
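As a quick sanity check of this formula, here is a sympy computation with $n=4$ and hypothetical linear functions $x^1 = a$, $x^2 = b$, $x^3 = c$, $x^4 = a + b + c$ on $\Sigma = \Bbb{R}^3$ (each $\alpha_i$ is then a global chart):

```python
import sympy as sp

a, b, c, x4 = sp.symbols('a b c x4')

# Hypothetical example: x1 = a, x2 = b, x3 = c, x4 = a + b + c on Sigma = R^3.
# In the chart alpha_i (all the x's except x^i), solve for x^i:
x1_in_a1 = x4 - b - c   # alpha_1 = (x2, x3, x4) = (b, c, x4)
x2_in_a2 = x4 - a - c   # alpha_2 = (x1, x3, x4) = (a, c, x4)
x3_in_a3 = x4 - a - b   # alpha_3 = (x1, x2, x4) = (a, b, x4)
x4_in_a4 = a + b + c    # alpha_4 = (x1, x2, x3) = (a, b, c)

cyc = (sp.diff(x1_in_a1, b)      # (∂x^1/∂x^2)_{alpha_1} = -1
       * sp.diff(x2_in_a2, c)    # (∂x^2/∂x^3)_{alpha_2} = -1
       * sp.diff(x3_in_a3, x4)   # (∂x^3/∂x^4)_{alpha_3} = +1
       * sp.diff(x4_in_a4, a))   # (∂x^4/∂x^1)_{alpha_4} = +1
assert cyc == (-1)**4
```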
The generalization of (!) from before is as follows: for all distinct $i,j\in\{1,\dots, n\}$ and all $\lambda\in\{1,\dots, n\}\setminus\{i,j\}$, \begin{align} \begin{cases} \left(\frac{\partial x^i}{\partial x^j}\right)_{\alpha_i} \left(\frac{\partial x^j}{\partial x^i}\right)_{\alpha_j}&= 1\\ \left(\frac{\partial x^i}{\partial x^{\lambda}}\right)_{\alpha_i}+ \left(\frac{\partial x^i}{\partial x^j}\right)_{\alpha_i} \left(\frac{\partial x^j}{\partial x^{\lambda}}\right)_{\alpha_j}&=0. \end{cases} \end{align} No summations here. Using the second equation $(n-2)$ times, and then the first equation once, it follows that \begin{align} &\left(\frac{\partial x^1}{\partial x^{2}}\right)_{\alpha_1} \left(\frac{\partial x^2}{\partial x^{3}}\right)_{\alpha_2}\cdots \left(\frac{\partial x^{n-1}}{\partial x^{n}}\right)_{\alpha_{n-1}} \left(\frac{\partial x^n}{\partial x^{1}}\right)_{\alpha_n}\\ =& (-1)^{n-2} \left(\frac{\partial x^1}{\partial x^{2}}\right)_{\alpha_1} \left(\frac{\partial x^2}{\partial x^{1}}\right)_{\alpha_2}\\ =& (-1)^{n-2}\cdot 1\\ =&(-1)^n. \end{align}
Alternatively, you could reduce to the level-set case by finding a suitable $n$-dimensional manifold $M$, an embedding $\Sigma\hookrightarrow M$ and a function $f$ whose regular level set is $\Sigma$ (and functions $\xi^1,\dots, \xi^n:M\to\Bbb{R}$ which form a global chart for $M$ and which pull back to $x^i$ under the embedding). For example, send $\Sigma$ to $\alpha_n[\Sigma]$ and send this to the graph of $x^n\circ\alpha_n^{-1}$ in $\Bbb{R}^n=M$ (then we can take $\xi^1,\dots,\xi^n$ to be the standard Cartesian coordinates on $M=\Bbb{R}^n$). I’ll leave the details to you.
Edit #2: Another way to think about it: more generalities and more functions
While I’m on this topic, I figured I might as well show another way of proving this relation. I think this might be the algebraically cleanest way of doing things. The versions above might seem to indicate that it is the miraculous properties of $1$-forms which make things work out. But this isn’t really the case; rather, it is about the simple manner in which top-degree forms behave. The space of top-degree forms is, at each point, a $1$-dimensional real vector space. So, we shall begin with some trivial facts about 1D vector spaces.
Definition/Lemma.
Let $V$ be a $1$-dimensional vector space over a field $\Bbb{F}$. Then, for each $x,\xi\in V$ with $\xi\neq 0$, there is a unique $c\in\Bbb{F}$ such that $x=c\xi$. We shall denote this unique number as $\frac{x}{\xi}$. This fraction notation is convenient and justified by the following rules:
for all $c\in\Bbb{F}$ and $x,\xi\in V$ with $\xi\neq 0$, we have $c\cdot \frac{x}{\xi}= \frac{cx}{\xi}$.
for all $x,y,\xi\in V$ with $\xi\neq 0$, we have $\frac{x}{\xi}\cdot y=x\cdot \frac{y}{\xi}$.
for all $x,y\in V$ and $\xi,\eta\in V\setminus\{0\}$, we have $\frac{x}{\xi}\cdot\frac{y}{\eta}=\frac{x}{\eta}\cdot\frac{y}{\xi}$.
To prove this, note that the existence of a unique number is simply because $V$ is 1-dimensional, so the singleton $\{\xi\}$ forms a basis. For the first bullet point, we have by the basic rules for manipulating scalars and vectors, \begin{align} cx= c\cdot \left(\frac{x}{\xi}\cdot \xi\right)=\left(c\cdot \frac{x}{\xi}\right)\cdot \xi. \end{align} In other words, the number $c\cdot \frac{x}{\xi}$ satisfies the same defining property as $\frac{cx}{\xi}$, hence they must be equal. The second bullet point is just as easily proved. The third is an easy application of the first two.
As mentioned above, 1-dimensional vector spaces occur very naturally in differential geometry, as the (fibers of the) top exterior power of the cotangent bundle of a manifold. Also, here, the determinant comes up very naturally of course. The following is a very easy computation, but I state it explicitly so we don’t get confused in our present setting where we have various coordinate charts and functions belonging to both charts simultaneously:
Lemma.
Let $M$ be a smooth $n$-dimensional manifold, $p,q\geq 0$ integers such that $n=p+q$. Suppose we have a coordinate chart $\alpha= (x^1,\dots, x^p, y^1, \dots, y^q)$ and functions $f^1,\dots, f^p:M\to\Bbb{R}$. Then, \begin{align} \frac{df^1\wedge\cdots \wedge df^p\wedge dy^1\wedge\cdots \wedge dy^q}{dx^1\wedge\cdots \wedge dx^p\wedge dy^1\wedge\cdots \wedge dy^q}= \det \left(\frac{\partial f}{\partial x}\right)_{\alpha}. \end{align}
Note that $\alpha$ is a coordinate chart, so the $n$-form in the denominator is actually nowhere-vanishing, so this quotient does indeed make sense by the previous lemma. The proof of this is now an easy computation. We know that the ratio is the determinant of the derivative of $(f,y)$ with respect to $(x,y)$. This is a special type of block matrix, so the resulting computation is easy: \begin{align} \frac{df^1\wedge\cdots \wedge df^p\wedge dy^1\wedge\cdots \wedge dy^q}{dx^1\wedge\cdots \wedge dx^p\wedge dy^1\wedge\cdots \wedge dy^q} &=\det \begin{pmatrix} \left(\frac{\partial f}{\partial x}\right)_{\alpha} & \left(\frac{\partial f}{\partial y}\right)_{\alpha}\\ \left(\frac{\partial y}{\partial x}\right)_{\alpha} & \left(\frac{\partial y}{\partial y}\right)_{\alpha} \end{pmatrix}\\ &=\det \begin{pmatrix} \left(\frac{\partial f}{\partial x}\right)_{\alpha} & \left(\frac{\partial f}{\partial y}\right)_{\alpha}\\ 0 & I_q \end{pmatrix}\\ &=\det \left(\frac{\partial f}{\partial x}\right)_{\alpha}. \end{align} The second row of the block matrix simplifies in that manner because we’re taking the derivatives of $y$ in the $\alpha= (x,y)$ coordinate chart, so things behave exactly as we would expect. This completes the proof.
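The block-determinant step can itself be checked symbolically; here is a small sympy sketch with $p=q=2$ and generic matrix entries:

```python
import sympy as sp

p = q = 2
A = sp.Matrix(p, p, list(sp.symbols('a0:4')))  # stands in for (∂f/∂x)_alpha
B = sp.Matrix(p, q, list(sp.symbols('b0:4')))  # stands in for (∂f/∂y)_alpha

# Assemble the block matrix [[A, B], [0, I_q]] and compare determinants.
M = sp.Matrix.vstack(sp.Matrix.hstack(A, B),
                     sp.Matrix.hstack(sp.zeros(q, p), sp.eye(q)))
assert sp.expand(M.det() - A.det()) == 0  # det [[A, B], [0, I_q]] = det A
```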
Examples.
First, let us revisit our good old example with $\Sigma$ being $2$-dimensional and $x,y,z:\Sigma\to\Bbb{R}$ being smooth functions such that each pair forms a coordinate chart. Writing each partial as a quotient of top-degree forms via the lemma (with the chart reordered so that the differentiation variable comes first), we get \begin{align} \left(\frac{\partial x}{\partial z}\right)_{(y,z)} \left(\frac{\partial z}{\partial y}\right)_{(x,y)} \left(\frac{\partial y}{\partial x}\right)_{(x,z)} &=\frac{dx\wedge dy}{dz\wedge dy}\cdot \frac{dz\wedge dx}{dy\wedge dx}\cdot \frac{dy\wedge dz}{dx\wedge dz}\\ &=\frac{dx\wedge dy}{dy\wedge dx}\cdot \frac{dz\wedge dx}{dx\wedge dz}\cdot \frac{dy\wedge dz}{dz\wedge dy}\\ &= (-1)\cdot (-1)\cdot (-1)\\ &=-1. \end{align} Note that we have repeatedly used the previous two lemmas, and the fact that the wedge product is alternating.
As another example, let $M$ be a $3$-dimensional smooth manifold, and let $x,y,z,u,v,w:M\to\Bbb{R}$ be smooth functions such that each triple forms a coordinate chart. Now, define
- $J_1= \frac{\partial x}{\partial u}\frac{\partial y}{\partial v}-\frac{\partial x}{\partial v}\frac{\partial y}{\partial u}$, where the derivatives are computed relative to the $(z,u,v)$ coordinate chart.
- $J_2= \frac{\partial x}{\partial w}$ where the derivatives are computed relative to the $(z,u,w)$ coordinate chart.
- $J_3= \frac{\partial u}{\partial y}\frac{\partial w}{\partial x}-\frac{\partial u}{\partial x}\frac{\partial w}{\partial y}$, where the derivatives are computed relative to the $(x,y,z)$ coordinate chart.
That’s a whole lot of crazy partials. If I now ask for the product $J_1J_2J_3$, is there a simple formula? Yes: \begin{align} J_1J_2J_3&=\frac{dx\wedge dy\wedge dz}{du\wedge dv\wedge dz}\cdot \frac{dx\wedge dz\wedge du}{dw\wedge dz\wedge du}\cdot \frac{du\wedge dw\wedge dz}{dy\wedge dx\wedge dz}\\ &=\frac{dx\wedge dy\wedge dz}{dy\wedge dx\wedge dz}\cdot \frac{dx\wedge dz\wedge du}{du\wedge dv\wedge dz}\cdot \frac{du\wedge dw\wedge dz}{dw\wedge dz\wedge du}\\ &=(-1)\cdot (-1)^2\left(\frac{\partial x}{\partial v}\right)_{(z,u,v)}\cdot (-1)^2\\ &= -\left(\frac{\partial x}{\partial v}\right)_{(z,u,v)}. \end{align} Things simplified because I tailor-made these functions to make things work out (assuming there are no typos).
In this manner, you can easily come up with many more such relations between (determinants of) partial derivatives of functions with respect to various coordinate systems.