Strings and String conversions (in Haskell)
Representation of Strings is a sore spot in Haskell, unfortunately.
The fundamental problem is that the ‘default’ String, the String
type from the standard library, is a linked list of characters.
Nicely enough it is Unicode capable and handles special characters well; however, using linked lists as strings is very inefficient.
Therefore marvin uses a more efficient string type called Text. To be precise, the Text type in Data.Text.Lazy from the text library.
The functions exposed by the marvin library generally all deal with this string type, to make things as easy as possible for the user.
However, when you interact with other libraries you might encounter other string types, such as ByteString (often the result of HTTP requests or input for JSON decoding) and String from the standard library, often in the form of FilePath as the name for files and directories.
Strict and lazy Text and ByteString
Furthermore, both Text and ByteString have a strict and a lazy variant.
It is not really necessary to know the difference between the lazy and strict variants of these strings; suffice it to say they are distinct types, so one cannot be passed where the other is expected.
If you need to convert between the strict and lazy variants of these strings, there are two functions: fromStrict (converts strict to lazy) and toStrict (converts lazy to strict). This is the case for both Text and ByteString.
The two conversion functions toStrict and fromStrict are always contained in the module holding the lazy version of the type. This means for Text it is Data.Text.Lazy and for ByteString it is Data.ByteString.Lazy.
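The conversions described above can be sketched as follows. This is a minimal illustration, assuming the text and bytestring packages are available; the wrapper names (textToLazy, textToStrict, bytesToLazy, bytesToStrict) are hypothetical and only there to make the types visible.

```haskell
import qualified Data.ByteString      as BS  -- strict ByteString
import qualified Data.ByteString.Lazy as BL  -- lazy ByteString, home of fromStrict / toStrict
import qualified Data.Text            as T   -- strict Text
import qualified Data.Text.Lazy       as TL  -- lazy Text, home of fromStrict / toStrict

-- Hypothetical wrappers; each one is just the library function
-- with its type spelled out.

-- | Strict Text to lazy Text (the variant marvin uses).
textToLazy :: T.Text -> TL.Text
textToLazy = TL.fromStrict

-- | Lazy Text back to strict Text.
textToStrict :: TL.Text -> T.Text
textToStrict = TL.toStrict

-- | Strict ByteString to lazy ByteString.
bytesToLazy :: BS.ByteString -> BL.ByteString
bytesToLazy = BL.fromStrict

-- | Lazy ByteString back to strict ByteString.
bytesToStrict :: BL.ByteString -> BS.ByteString
bytesToStrict = BL.toStrict
```

Importing the strict and lazy modules under different qualified names (T versus TL, BS versus BL) is a common convention that keeps the two variants apart in type signatures.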
If you want to know which String type you have, look at the module.
- strict Text comes from either Data.Text or Data.Text.Internal
- lazy Text comes from either Data.Text.Lazy or Data.Text.Internal.Lazy
ByteString uses nearly the same naming scheme: replace Text with ByteString. The one exception is the lazy internal module, which is named Data.ByteString.Lazy.Internal.
Converting String
To convert a String value to the string type used in marvin, use pack from the Data.Text.Lazy module. To convert a value of the string type used in marvin back to a String, use unpack from the Data.Text.Lazy module. You can use the same functions for FilePath, as it is only an alias for String.
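As a small sketch of these two conversions, assuming only the text package: the names toText, fromText and describeFile are hypothetical helpers, not part of marvin or text.

```haskell
import qualified Data.Text.Lazy as TL

-- | String to the lazy Text used by marvin.
toText :: String -> TL.Text
toText = TL.pack

-- | Lazy Text back to String.
fromText :: TL.Text -> String
fromText = TL.unpack

-- | FilePath is a type synonym for String, so pack accepts it directly.
-- (Hypothetical helper that builds a message from a path.)
describeFile :: FilePath -> TL.Text
describeFile path = TL.append (TL.pack "Reading ") (TL.pack path)
```

Going the other way, a Text that holds a path can be handed to functions expecting a FilePath by applying unpack first.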
Converting ByteString
ByteString values are a lot harder to convert than String, because they have no specified encoding.
To convert a ByteString value, first make sure it is a lazy ByteString. If it is not, convert it with fromStrict. Then you’ll have to decode the ByteString.
The text library offers a number of decoding functions in the Data.Text.Lazy.Encoding module. The encode* functions are used to create ByteStrings from Text and the decode* functions are used to turn ByteStrings into Text.
If you are not sure what encoding your source text uses, try Utf8 (decodeUtf8); it is the most common.
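The steps above (make the ByteString lazy, then decode it) might look like the following sketch. It assumes the input is UTF-8 and uses decodeUtf8' from Data.Text.Lazy.Encoding, which reports invalid input as a Left value; the helper names decodeStrictUtf8 and encodeToStrictUtf8 are hypothetical.

```haskell
import qualified Data.ByteString          as BS
import qualified Data.ByteString.Lazy     as BL
import           Data.Text.Encoding.Error (UnicodeException)
import qualified Data.Text.Lazy           as TL
import qualified Data.Text.Lazy.Encoding  as TLE

-- | Decode a strict ByteString that is assumed to contain UTF-8.
-- Step 1: fromStrict makes it lazy. Step 2: decodeUtf8' decodes it,
-- returning Left on malformed input.
decodeStrictUtf8 :: BS.ByteString -> Either UnicodeException TL.Text
decodeStrictUtf8 = TLE.decodeUtf8' . BL.fromStrict

-- | The reverse direction: encode lazy Text as UTF-8 and make the
-- resulting lazy ByteString strict.
encodeToStrictUtf8 :: TL.Text -> BS.ByteString
encodeToStrictUtf8 = BL.toStrict . TLE.encodeUtf8
```

The plain decodeUtf8 has the simpler type ByteString -> Text but throws an exception on invalid input, so the Either-returning variant may be the safer choice when the bytes come from an external source such as an HTTP response.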