|
- utstring: dynamic string macros for C
- =====================================
- Troy D. Hanson <tdh@tkhanson.net>
- v2.3.0, February 2021
-
- Here's a link back to the https://github.com/troydhanson/uthash[GitHub project page].
-
- Introduction
- ------------
- A set of basic dynamic string macros for C programs are included with
- uthash in `utstring.h`. To use these in your own C program, just copy
- `utstring.h` into your source directory and use it in your programs.
-
- #include "utstring.h"
-
- The dynamic string supports operations such as inserting data, concatenation,
- getting the length and content, substring search, and clear. It's ok to put
- binary data into a utstring too. The string <<operations,operations>> are
- listed below.
-
- Some utstring operations are implemented as functions rather than macros.
-
- Download
- ~~~~~~~~
- To download the `utstring.h` header file,
- follow the links on https://github.com/troydhanson/uthash to clone uthash or get a zip file,
- then look in the src/ sub-directory.
-
- BSD licensed
- ~~~~~~~~~~~~
- This software is made available under the
- link:license.html[revised BSD license].
- It is free and open source.
-
- Platforms
- ~~~~~~~~~
- The 'utstring' macros have been tested on:
-
- * Linux,
- * Windows, using Visual Studio 2008 and Visual Studio 2010
-
- Usage
- -----
-
- Declaration
- ~~~~~~~~~~~
-
- The dynamic string itself has the data type `UT_string`. It is declared like,
-
- UT_string *str;
-
- New and free
- ~~~~~~~~~~~~
- The next step is to create the string using `utstring_new`. Later when you're
- done with it, `utstring_free` will free it and all its content.
-
- Manipulation
- ~~~~~~~~~~~~
- The `utstring_printf` or `utstring_bincpy` operations insert (copy) data into
- the string. To concatenate one utstring to another, use `utstring_concat`. To
- clear the content of the string, use `utstring_clear`. The length of the string
- is available from `utstring_len`, and its content from `utstring_body`. This
- evaluates to a `char*`. The buffer it points to is always null-terminated.
- So, it can be used directly with external functions that expect a string.
- This automatic null terminator is not counted in the length of the string.
-
- Samples
- ~~~~~~~
-
- These examples show how to use utstring.
-
- .Sample 1
- -------------------------------------------------------------------------------
- #include <stdio.h>
- #include "utstring.h"
-
- int main() {
- UT_string *s;
-
- utstring_new(s);
- utstring_printf(s, "hello world!" );
- printf("%s\n", utstring_body(s));
-
- utstring_free(s);
- return 0;
- }
- -------------------------------------------------------------------------------
-
- The next example demonstrates that `utstring_printf` 'appends' to the string.
- It also shows concatenation.
-
- .Sample 2
- -------------------------------------------------------------------------------
- #include <stdio.h>
- #include "utstring.h"
-
- int main() {
- UT_string *s, *t;
-
- utstring_new(s);
- utstring_new(t);
-
- utstring_printf(s, "hello " );
- utstring_printf(s, "world " );
-
- utstring_printf(t, "hi " );
- utstring_printf(t, "there " );
-
- utstring_concat(s, t);
- printf("length: %u\n", utstring_len(s));
- printf("%s\n", utstring_body(s));
-
- utstring_free(s);
- utstring_free(t);
- return 0;
- }
- -------------------------------------------------------------------------------
-
- The next example shows how binary data can be inserted into the string. It also
- clears the string and prints new data into it.
-
- .Sample 3
- -------------------------------------------------------------------------------
- #include <stdio.h>
- #include "utstring.h"
-
- int main() {
- UT_string *s;
- char binary[] = "\xff\xff";
-
- utstring_new(s);
- utstring_bincpy(s, binary, sizeof(binary));
- printf("length is %u\n", utstring_len(s));
-
- utstring_clear(s);
- utstring_printf(s,"number %d", 10);
- printf("%s\n", utstring_body(s));
-
- utstring_free(s);
- return 0;
- }
- -------------------------------------------------------------------------------
-
- [[operations]]
- Reference
- ---------
- These are the utstring operations.
-
- Operations
- ~~~~~~~~~~
-
- [width="100%",cols="50<m,40<",grid="none",options="none"]
- |===============================================================================
- | utstring_new(s) | allocate a new utstring
- | utstring_renew(s) | allocate a new utstring (if s is `NULL`) otherwise clears it
- | utstring_free(s) | free an allocated utstring
- | utstring_init(s) | init a utstring (non-alloc)
- | utstring_done(s) | dispose of a utstring (non-alloc)
- | utstring_printf(s,fmt,...) | printf into a utstring (appends)
- | utstring_bincpy(s,bin,len) | insert binary data of length len (appends)
- | utstring_concat(dst,src) | concatenate src utstring to end of dst utstring
- | utstring_clear(s) | clear the content of s (setting its length to 0)
- | utstring_len(s) | obtain the length of s as an unsigned integer
- | utstring_body(s) | get `char*` to body of s (buffer is always null-terminated)
- | utstring_find(s,pos,str,len) | forward search from pos for a substring
- | utstring_findR(s,pos,str,len) | reverse search from pos for a substring
- |===============================================================================
-
- New/free vs. init/done
- ~~~~~~~~~~~~~~~~~~~~~~
- Use `utstring_new` and `utstring_free` to allocate a new string or free it. If
- the UT_string is statically allocated, use `utstring_init` and `utstring_done`
- to initialize or free its internal memory.
-
- Substring search
- ~~~~~~~~~~~~~~~~
- Use `utstring_find` and `utstring_findR` to search for a substring in a utstring.
- It comes in forward and reverse varieties. The reverse search scans from the end of
- the string backward. These take a position to start searching from, measured from 0
- (the start of the utstring). A negative position is counted from the end of
- the string, so, -1 is the last position. Note that in the reverse search, the
- initial position anchors to the 'end' of the substring being searched for;
- e.g., the 't' in 'cat'. The return value always refers to the offset where the
- substring 'starts' in the utstring. When no substring match is found, -1 is
- returned.
-
- For example if a utstring called `s` contains:
-
- ABC ABCDAB ABCDABCDABDE
-
- Then these forward and reverse substring searches for `ABC` produce these results:
-
- utstring_find( s, -9, "ABC", 3 ) = 15
- utstring_find( s, 3, "ABC", 3 ) = 4
- utstring_find( s, 16, "ABC", 3 ) = -1
- utstring_findR( s, -9, "ABC", 3 ) = 11
- utstring_findR( s, 12, "ABC", 3 ) = 4
- utstring_findR( s, 2, "ABC", 3 ) = 0
-
- "Multiple use" substring search
- ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
- The preceding examples show "single use" versions of substring matching, where
- the internal Knuth-Morris-Pratt (KMP) table is internally built and then freed
- after the search. If your program needs to run many searches for a given
- substring, it is more efficient to save the KMP table and reuse it.
-
- To reuse the KMP table, build it manually and then pass it into the internal
- search functions. The functions involved are:
-
- _utstring_BuildTable (build the KMP table for a forward search)
- _utstring_BuildTableR (build the KMP table for a reverse search)
- _utstring_find (forward search using a prebuilt KMP table)
- _utstring_findR (reverse search using a prebuilt KMP table)
-
- This is an example of building a forward KMP table for the substring "ABC", and
- then using it in a search:
-
- long *KPM_TABLE, offset;
- KPM_TABLE = (long *)malloc( sizeof(long) * (strlen("ABC")) + 1));
- _utstring_BuildTable("ABC", 3, KPM_TABLE);
- offset = _utstring_find(utstring_body(s), utstring_len(s), "ABC", 3, KPM_TABLE );
- free(KPM_TABLE);
-
- Note that the internal `_utstring_find` has the length of the UT_string as its
- second argument, rather than the start position. You can emulate the position
- parameter by adding to the string start address and subtracting from its length.
-
- Notes
- ~~~~~
-
- 1. To override the default out-of-memory handling behavior (which calls `exit(-1)`),
- override the `utstring_oom()` macro before including `utstring.h`.
- For example,
-
- #define utstring_oom() do { longjmp(error_handling_location); } while (0)
- ...
- #include "utstring.h"
-
- // vim: set nowrap syntax=asciidoc:
|