Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Require explicit, single-threaded initialization #91

Merged
merged 13 commits into from
Sep 18, 2023
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 1 addition & 1 deletion .circleci/config.yml
Original file line number Diff line number Diff line change
Expand Up @@ -43,7 +43,7 @@ jobs:

- run:
name: "Nix Build"
no_output_timeout: "30m"
no_output_timeout: "90m"
command: |
nix build --print-build-logs

Expand Down
3 changes: 2 additions & 1 deletion fec.cabal
Original file line number Diff line number Diff line change
@@ -1,6 +1,6 @@
cabal-version: 3.0
name: fec
version: 0.1.1
version: 0.2.0
license: GPL-2.0-or-later
license-file: README.rst
author: Adam Langley <agl@imperialviolet.org>
Expand Down Expand Up @@ -31,6 +31,7 @@ library
build-depends:
, base
, bytestring >=0.9
, extra

exposed-modules: Codec.FEC
default-extensions: ForeignFunctionInterface
Expand Down
34 changes: 31 additions & 3 deletions haskell/Codec/FEC.hs
Original file line number Diff line number Diff line change
Expand Up @@ -18,6 +18,7 @@
-}
module Codec.FEC (
FECParams (paramK, paramN),
initialize,
fec,
encode,
decode,
Expand All @@ -29,6 +30,8 @@ module Codec.FEC (
deFEC,
) where

import Control.Concurrent.Extra (Lock, newLock, withLock)
import Control.Exception (Exception, throwIO)
import Data.Bits (xor)
import qualified Data.ByteString as B
import qualified Data.ByteString.Unsafe as BU
Expand All @@ -42,7 +45,7 @@ import Foreign.ForeignPtr (
)
import Foreign.Marshal.Alloc (allocaBytes)
import Foreign.Marshal.Array (advancePtr, withArray)
import Foreign.Ptr (FunPtr, Ptr, castPtr)
import Foreign.Ptr (FunPtr, Ptr, castPtr, nullPtr)
import Foreign.Storable (poke, sizeOf)
import System.IO (IOMode (..), withFile)
import System.IO.Unsafe (unsafePerformIO)
Expand All @@ -57,6 +60,8 @@ data FECParams = FECParams
instance Show FECParams where
show (FECParams _ k n) = "FEC (" ++ show k ++ ", " ++ show n ++ ")"

foreign import ccall unsafe "fec_init"
_init :: IO ()
foreign import ccall unsafe "fec_new"
_new ::
-- | k
Expand Down Expand Up @@ -101,6 +106,24 @@ isValidConfig k n
| n > 255 = False
| otherwise = True

{- | The underlying library signaled that it has not been properly initialized
yet. Use @initialize@ to initialize it.
-}
data Uninitialized = Uninitialized deriving (Ord, Eq, Show)

instance Exception Uninitialized

-- A lock to ensure at most one thread attempts to initialize the underlying
-- library at a time. Multiple initializations are harmless but concurrent
-- initializations are disallowed.
_initializationLock :: Lock
{-# NOINLINE _initializationLock #-}
_initializationLock = unsafePerformIO newLock

-- | Initialize the library. This must be done before other APIs can succeed.
initialize :: IO ()
initialize = withLock _initializationLock _init

-- | Return a FEC with the given parameters.
fec ::
-- | the number of primary blocks
Expand All @@ -115,8 +138,13 @@ fec k n =
unsafePerformIO
( do
cfec' <- _new (fromIntegral k) (fromIntegral n)
params <- newForeignPtr _free cfec'
return $ FECParams params k n
-- new will return null if the library hasn't been
-- initialized.
if cfec' == nullPtr
then throwIO Uninitialized
else do
params <- newForeignPtr _free cfec'
return $ FECParams params k n
)

-- | Create a C array of unsigned from an input array
Expand Down
51 changes: 41 additions & 10 deletions haskell/test/FECTest.hs
Original file line number Diff line number Diff line change
Expand Up @@ -6,24 +6,25 @@ import Test.Hspec (describe, hspec, it, parallel)

import qualified Codec.FEC as FEC
import qualified Data.ByteString as B
import Data.Int ()
import qualified Data.ByteString.Lazy as BL
import Data.List (sortOn)
import Data.Serializer ()
import Data.Word (Word16, Word8)

import System.Random (Random (randoms), mkStdGen)
import Test.QuickCheck (
Arbitrary (arbitrary),
Property,
Testable (property),
choose,
conjoin,
once,
withMaxSuccess,
(===),
)
import Test.QuickCheck.Monadic (assert, monadicIO, run)

-- Imported for the orphan Arbitrary ByteString instance.

import Control.Monad (replicateM_)
import Test.QuickCheck.Instances.ByteString ()

-- | Valid ZFEC parameters.
Expand Down Expand Up @@ -103,9 +104,36 @@ prop_deFEC (Params req tot) testdata =
allShares = FEC.enFEC req tot testdata
minimalShares = take req allShares

prop_primary_copies :: Params -> BL.ByteString -> Property
prop_primary_copies (Params _ tot) primary = property $ do
conjoin $ (BL.toStrict primary ===) <$> secondary
where
fec = FEC.fec 1 tot
secondary = FEC.encode fec [BL.toStrict primary]

main :: IO ()
main = hspec $
parallel $ do
main = do
-- Be sure to do the required zfec initialization first.
FEC.initialize
hspec . parallel $ do
describe "encode" $ do
-- This test originally caught a bug in multi-threaded
-- initialization of the C library. Since it is in the
-- initialization codepath, it cannot catch the bug if it runs
-- after initialization has happened. So we put it first in the
-- suite and hope that nothing manages to get in before this.
--
-- Since the bug has to do with multi-threaded use, we also make a
-- lot of copies of this property so hspec can run them in parallel
-- (QuickCheck will not do anything in parallel inside a single
-- property).
--
-- Still, there's non-determinism and the behavior is only revealed
-- by this property sometimes.
replicateM_ 20 $
it "returns copies of the primary block for all 1 of N encodings" $
withMaxSuccess 5 prop_primary_copies

describe "secureCombine" $ do
-- secureDivide is insanely slow and memory hungry for large inputs,
-- like QuickCheck will find with it as currently defined. Just pass
Expand All @@ -115,8 +143,11 @@ main = hspec $
it "is the inverse of secureDivide n" $ once $ prop_divide 1024 65 3

describe "deFEC" $ do
it "is the inverse of enFEC" (withMaxSuccess 2000 prop_deFEC)

describe "decode" $ do
it "is (nearly) the inverse of encode" (withMaxSuccess 2000 prop_decode)
it "works with required=255" $ property $ prop_decode (Params 255 255)
replicateM_ 10 $
it "is the inverse of enFEC" $
property prop_deFEC

describe "decode" $
replicateM_ 10 $ do
it "is (nearly) the inverse of encode" $ property $ prop_decode
it "works with required=255" $ property $ prop_decode (Params 255 255)
2 changes: 2 additions & 0 deletions zfec/_fecmodule.c
Original file line number Diff line number Diff line change
Expand Up @@ -673,6 +673,8 @@ init_fec(void) {
py_fec_error = PyErr_NewException("_fec.Error", NULL, NULL);
PyDict_SetItemString(module_dict, "Error", py_fec_error);

fec_init();

#if PY_MAJOR_VERSION >= 3
return module;
#endif
Expand Down
26 changes: 19 additions & 7 deletions zfec/fec.c
Original file line number Diff line number Diff line change
Expand Up @@ -393,12 +393,23 @@ _invert_vdm (gf* src, unsigned k) {
return;
}

/* There are few (if any) ordering guarantees that apply to reads and writes
* of this static int across threads. This is the reason for some of the
* tight requirements for how `fec_init` is called. If we could use a mutex
* or a C11 atomic here we might be able to provide more flexibility to
* callers. It's tricky to do that while remaining compatible with all of
* macOS/Linux/Windows and CPython's MSVC requirements and not switching to
* C++ (or something even more different).
*/
static int fec_initialized = 0;
static void
init_fec (void) {
generate_gf();
_init_mul_table();
fec_initialized = 1;

void
fec_init (void) {
Copy link

@itamarst itamarst Sep 14, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thread synchronization isn't immediate. If thread 1 and 2 are both started, and thread 1 calls init, thread 2 won't necessarily see updated versions of the memory modified by thread 1, and even if it does it may be inconsistent across different memory locations. Put another way, modifying the static does not create a happens-before relationship, except for threads started after init_fec() is called.

The solution is to make fec_initialized an atomic, which is supported by modern C, and then read/write it with acquire/release atomic ordering.

https://marabos.nl/atomics/ is good book about this; Rust's model is based on C++'s model, and C's model is probably also C++'s model, so the semantics will all be the same even if the APIs are slightly different.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another alternative is to just create a static lock that needs to be acquired to initialized or check initialization.(Acquiring/releasing a lock is equivalent to acquire/release semantic on an atomic variable, so it too creates a happens-before relationship.)

This would simplify the API in that you don't need to make fec_init() public, you could just always call it. It would also make fec_new() slightly slower, but probably not in a very meaningful way.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, great points here. From some simple experiments, it looks like neither C11 "atomics" nor "threads" are available from the MSVC we get from current CI (eg, stdatomic.h exists but merely including it leads to numerous errors). I don't quite understand this since we're using the Windows 2022 image and https://github.com/actions/runner-images/blob/main/images/win/Windows2022-Readme.md says this includes MSVC 17.7. The build failures have a version number that looks like "14.35" ... but maybe that's not the MSVC version? Maybe https://reactos.org/wiki/Visual_Studio_Versions is suggesting that "14.35" actually means "17.7". I don't know.

So for now I'm going with "documentation". Hopefully later someone can figure out how to get at atomics/threads on Windows and we can strengthen the implementation.

Copy link

@hacklschorsch hacklschorsch Sep 14, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't a mutex suffice here? Maybe that is available?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A statically-initializable mutex here would work, yes. There isn't one in C99 so we'd need a third-party library. There is probably one but I am reluctant to give zfec it's first third-party build-time dependency. It seems likely that pthreads offers some kind of solution here, which would at least solve the problem for Linux (maybe macOS too?), but it's not the obvious pthread_mutex_t because those are not statically initializable (so all you do is move the initialization problem from "coordinate zfec initialization" to "coordinate zfec initialization mutex initialization"). Windows probably has its own solution but I don't know what it is offhand either (there's at least 5 solutions that aren't statically initializable, for sure).

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have vague memory that certain Python releases require certain MSVC versions? So older Python might get older MSVC or something.

if (fec_initialized == 0) {
generate_gf();
_init_mul_table();
fec_initialized = 1;
}
}

/*
Expand Down Expand Up @@ -428,8 +439,9 @@ fec_new(unsigned short k, unsigned short n) {
assert(n <= 256);
assert(k <= n);

if (fec_initialized == 0)
init_fec ();
if (fec_initialized == 0) {
return NULL;
}

retval = (fec_t *) malloc (sizeof (fec_t));
retval->k = k;
Expand Down
12 changes: 12 additions & 0 deletions zfec/fec.h
Original file line number Diff line number Diff line change
Expand Up @@ -20,6 +20,18 @@ typedef struct {
#define restrict
#endif

/** Initialize the fec library.
*
* Call this:
*
* - at least once
* - from at most one thread at a time
* - before calling any other APIs from the library
* - before creating any other threads that will use APIs from the library
*
*/
void fec_init(void);

/**
* param k the number of blocks required to reconstruct
* param m the total number of blocks created
Expand Down
Loading