So TheQuickBrownFox is spot on when they mention the potential exception with Seq.Cast<'T>
, and I want to talk about that and introduce a couple new operators to you.
The potential problem lies in the fact that it could be possible for an item to not be castable to XmlNode
, and then we throw exceptions which is usually frowned upon in F#, we would instead want to return an error state.
One of the many operators in F# is the :?
operator, which tests if a value is castable to the specified type. (So x :? XmlNode
will test if x
is an XmlNode
.) We can thing of this as analogous to the is
operator in C#.
So I want to write a castSafe
method which will cast our object list to a type, without throwing an exception. We intend to use this as channelNode.ChildNodes |> Seq.castSafe<XmlNode>
.
The first step is to define the type-structure of our function, which I'll define as IEnumerable -> 'T option seq
. We also want to define what happens if an invalid cast is attempted? I'm going to return a 'T option
, that defines the result. If it's Some 'T
, then we casted successfully, if it's None
, then we didn't.
A basic, lazy implementation might look as follows:
let castSafe<'T> (items:IEnumerable) : 'T option seq = items |> Seq.cast<obj> |> Seq.map (fun node -> match node with | :? 'T as result -> Some result | _ -> None)
So we're immediately casting to obj
to allow the type-system to make it's next move. Since XmlNodeList
doesn't implement IEnumerable<T>
we can't enumerate each item as an XmlNode
: we have to enumerate as an obj
. Then we match
on the type: :? 'T
says "Is node
a 'T
type?" If it's a yes, then we return Some result
which is the node casted as 'T
, if not we return a None
.
Of course this means on the client-side we would have to test for Some node
or None
, which we don't want to do. So we'll also build a castSafeFilter
which will cast to 'T
and remove the None
entires. A naive implementation could look as follows:
let castSafeFilter<'T> (items:IEnumerable) : 'T seq = items |> castSafe |> Seq.filter (fun node -> match node with | Some x -> true | None -> false) |> Seq.map (fun node -> node.Value)
But we don't want to do that: we're casting then testing for Some
/None
, but there's a more effective manner: we can filter first. Enter the :?>
operator. This will do a direct cast to a type lower in the type hierarchy, that is: this will do a runtime cast to a sub-type. (So if we cast XmlNode -> obj
, this let's us go back to XmlNode
.) I wish this were synonymous with the as
operator in C#, but it's more like an explicit cast: (T)obj
.
Our new function looks something like this:
let castSafeFilter<'T> (items:IEnumerable) : 'T seq = items |> Seq.cast<obj> |> Seq.filter (fun node -> node :? 'T) |> Seq.map (fun node -> node :?> 'T)
Do the <obj>
cast immediately, but then filter out all non-'T
items then cast.
By doing it like this we eliminate the double-map, and do a single map to 'T
if and only if it is a 'T
. More interesting: we can add these two methods directly to the Seq
module just as if they were extension methods:
module Seq open System.Collections let castSafe<'T> (items:IEnumerable) : 'T option seq = items |> Seq.cast<obj> |> Seq.map (fun node -> match node with | :? 'T as result -> Some result | _ -> None) let castSafeFilter<'T> (items:IEnumerable) : 'T seq = items |> Seq.cast<obj> |> Seq.filter (fun node -> node :? 'T) |> Seq.map (fun node -> node :?> 'T)
Usually I would omit type-annotations, but when building a generic, multi-purpose module I always keep them. We could define an fsi
file instead that would contain all the signatures, but that's beyond the scope of what I wish to explain here.
Finally, we're back to our code, you had:
let items = channelNode.ChildNodes |> Seq.cast<XmlNode> |> Seq.filter (fun node -> node.Name.Equals "item") |> Seq.map itemNodeToItem
We're going to change that to:
let items = channelNode.ChildNodes |> Seq.castSafeFilter<XmlNode> |> Seq.filter (fun node -> node.Name.Equals "item") |> Seq.map itemNodeToItem
And bam, now we have safely casted the items to an XmlNode
, and allowed ourselves a great deal of reuse in the future.
And finally, with some extra cleverness, we could even devise a castFilter
method, that would test if our node
is a 'T
, and if so do the filtering at that level to avoid redundant calls. That is:
let castFilter<'T> (filter:'T -> bool) (items:IEnumerable) : 'T seq = items |> Seq.cast<obj> |> Seq.filter (fun node -> match node with | :? 'T as result -> result |> filter | _ -> false) |> Seq.map (fun node -> node :?> 'T)
And interestingly if we work with result
on the filter at that point, the :? 'T as result
tests if node
is a 'T
, and does the node :?> 'T
cast for us, so we've finally achieved the same functionality as the C# as
keyword.
And as mentioned in comments, using Seq.choose
is another great alternative, which opens us up to a little bit of extraction:
let private castItemSafe<'T> (item:obj) : 'T option = match item with | :? 'T as result -> Some result | _ -> None let castSafe<'T> (items:IEnumerable) : 'T option seq = items |> Seq.cast<obj> |> Seq.map castItemSafe let castSafeFilter<'T> (items:IEnumerable) : 'T seq = items |> Seq.cast<obj> |> Seq.choose castItemSafe
Because of the nature of our casting, we really only need to store the function which casts an individual item in one spot, since map
will return all items and choose
will naturally filter them out.
I previously touched on some basic composition stuff, and I want to continue that discussion with some new examples here.
We have opportunity to use some function composition in this code, which lends us to partial application, and allow us to not care about the node parameter to our functions.
For example, we can take your titleText
and descriptionText
functions and simply omit the node
parameter and our functions still work the same, thus leaving them agnostic to what's applied even moreso than they were. But what if we want to omit node from elementAsDateTimeOffset
? Well we can't really do that as it's written, because the pipe-right (|>
) operator expects a value, not a function. Now we can use the compose-right (>>
) operator to eliminate that requirement:
let titleText = innerText "title" let descriptionText = innerText "description" let elementAsDateTimeOffset elementName = innerText elementName >> DateTimeOffset.Parse let elementAsUri elementName = innerText elementName >> Uri let linkUri = elementAsUri "link"
Now here is the part I really like: none of our signatures changed, but we no longer care about the parameter itself in these functions. We've partially applied the functions. For example, on titleText
it is simply applying the "title"
string to the innerText
function (which expects two parameters) and returning a function that just needs an XmlNode
applied to it.
I like to use this more and more in my code as it seems a bit clearer to me in the end. The net result is the same, and there's no significant advantage (that I know of - do note I'm no expert) to using one over the other, but it becomes a preference on the developer's end. (This also lends itself very easily towards the introduction of currying: that is, every function has one and only one parameter, even those with multiple parameters.)