Todo
This page is read-only because I use it as a primitive bug tracker. Please report bugs to me via email rather than try to add them to this page.
Compiler
word32PrimTy is hardcoded in runtimeErrorTy in MkId
stablePtrPrimTyCon probably needs to go (just use Addr#)
- Use -fext-core to test with 64-bit int (I think the build system supports this directly), need to fudge WORD_SIZE_IN_BITS somehow, the frontend should probably define it directly, and not depend on random include files (which I'd like to get rid of completely for java)
- Once we have unified +#, ==#, etc look into populating the NPat eq and mb_neg syntaxexpr fields, dunno if we could actually use them though
- Actually I think populating these (and the fromInteger syntaxexpr in HsOverLit) will eliminate many of the ol_unboxed special cases
- unliftedKind happens automatically from cast#, etc
- We'll never actually emit the cast# (n# :: Int64#) but it makes things nicer for the typechecker
- The renamer will have to handle the initial lookup of the syntaxexprs specially (fromInteger -> cast#, negate -> negate#, subtract -> -#)
- I think we can actually make n+k patterns work with this too
- Also, remove the negative primitive int literal magic from the lexer, NegApp should be able to handle it
- TcGenDerive is a mess, revert and try to do it again cleaner (cleaner being less changes)
- Parser ensures Primitive int literals are in Double/Int64 range (so compile can never fail)
- Fix the bad_args check (be way more strict) in coreToStg, imho we should just get rid of unsafeCoerce# altogether
- We might need to replace it with some kind of newtype cast operator thingamajig (see libraries for uses)
- Move mkIntegerExpr, mkCharExpr (maybe rewrite mkCharExpr in terms of lookupId a-la mkIntegerExpr) into desugarer
- Ditch the tycon arity?
- In isSubKind change is_leaf to is_branch (just invert it), one less case
- I think we should move to MachInt8 Int8, MachWord32 Word32, MachFloat Float32, etc and constant folding with overflow
- Look into making GHC.Prim less magic
- It would be a somewhat normal module, with a real .hi file
- Need to add some bogus syntax for primitive types (they have no definition and the real types will be picked up by the normal wired-in magic)
- Need to add some syntax for inline primops (foreign import inline primitive?, unsafe primitive?, do a read :: String -> PrimOp on the entity?) This way Int# is in scope for type signatures
- Some primitive foreign imports would go here (stuff like raise# :: Exception -> a would have to go where Exception is defined though)
- would define type Int# = Int32#, etc
- Surely some new IfaceSyn syntax would be required
- PrelRules would have to be magiced in. We might be able to eliminate the code to attach rules to an Id
- Since we're making most (if not all) of the inline primitives overloaded we might as well do away with PrimOpWrappers and just implement the primitives in bytecode in GHCI
- Blah
- Have only Void# :: * -> ## void type
- doStupidChecks checks for this on primops
- punt on the float bit ops, allow them for now, call it a bug in the codegen
- I think we can get rid of RealWorld and Any (and hence the is_lifted field in PrimTyCon), be careful with Dynamic though (maybe alg types with no dcons are not data types?)
- For nativecodegen checkcast#, foreign imported types have the label of the info table as their external name, checkcast compares with that, if it is a zero length string (as it would be for Ref#), a noop. could cause trouble for ptr arrays that switch between a few info ptrs...
- Possible plan B, stick some kind of description in the info table, I think we can change the ThunkInfo to a ConstrInfo in cmm/CmmParse.y (for INFO_TABLE) to cobble a string in there, this might cause trickiness for Thread# though (and other deeply magical types)
- Not sure what happens if we make array ops can_fail (could be null or out of bounds) I don't think they are now
- Can we make seq work on unlifted types? let x = indexArray# a i in x `seq` Box# x?
- We should use explicit foralls in primops.txt, so we know the order of type args, for ad-hoc checks in typechecker
- Put the ad-hoc checks in primops.txt (blah = { \tc -> isForeignTyCon tc }, blah = {\tc -> tyConUnique `elem` [int8,int16,...]), etc)
- Remove PrimOpResultInfo (ReturnsPrim/ReturnsAlg, we always ReturnPrim after the Bool# thing)
- Bring back primrep in the form: data PrimRep = Integral Bool{-signed-} Size | Floating Size | Void | Ptr FastString
Libraries
- type Int# = Int32# in GHC.IntWord64
- chr/unsafeChr/ord in GHC.Base assume Int# = Char#
- GHC.Storable assumed Int# = Char#
- Go with a plain old huge byte[] heap, with malloc, etc
- peekElemOff#, pokeElemOff# map to GHC.RTS.Heap.peekElemOff, etc
- FD is GHC.FD, a class with subclasses for RandomAccess, InputStream, etc (this used to be handled in haskell itself with a datatype but this is cleaner for IOBase)
- better: FileHandle is always a RAF, DuplexHandle is always an InputStream/OutputStream pair
- Not sure what to do for read-only streams, either have DuplexHandle be a (Maybe Handle__) (Maybe Handle__), or make the other side a ClosedHandle type, or put a dummy {Input,Output} stream in there...
Notes from the C standard
- When any scalar value is converted to _Bool, the result is 0 if the value compares equal to 0; otherwise, the result is 1.
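The _Bool rule above is the same normalization the Bool# notes rely on (toBool is i != 0, and a Bool# is always 0 or 1). A minimal sketch, using Int32 as a stand-in for the unboxed representation; the names toBool, fromBool and normalizeBool are illustrative and not taken from the compiler source:

```haskell
import Data.Int (Int32)

-- Any non-zero scalar counts as true (the C _Bool conversion rule).
toBool :: Int32 -> Bool
toBool = (/= 0)

-- A Bool is always represented as exactly 0 or 1.
fromBool :: Bool -> Int32
fromBool b = if b then 1 else 0

-- What a C foreign import returning an int-typed "bool" would need
-- before its result can be treated as a 0/1 value.
normalizeBool :: Int32 -> Int32
normalizeBool = fromBool . toBool
```

For instance, normalizeBool maps both 42 and -1 to 1, and 0 to 0.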
Done
mkMachInt32/mkMachWord32 hardcoded in PrelRules
Lots of Int64 cases left out of PrelRules
primRepSizeW in TyCon is totally bogus
fix shortCutLit again (broken when we removed HsInt again)
primRepToCgRep is non-exhaustive
DsForeign.primTyDescChar is non-exhaustive panicked
mkSimpleLit is wrong if we go with always >= word size SMReps (Int32 isn't always I32)
do something different with nsHsIntLiteral in TcGenDerive (see above though)
I think we can get rid of intDataConName (and probably float, double, etc, maybe more)
in shortCutLit rather than doing a lookup on tyconkey switch on tyconprimrep no
int32PrimTy is hardcoded in intDataCon in TysWiredIn
Still commented out, get rid of it totally
mkIntExpr commented out in MkCore
Move explicit ranges in TcHsSyn (for shortcut literal) to inInt8Range, etc in Literal
Get rid of wired in Double/Float types like we did/will do for Int/Word
Nuke or fix IfaceIntTc
panic in hsLitType for HsInt
panic in dsLit for HsInt
int32PrimTy/word32PrimTy are hardcoded in gen_PrimOrd_binds,genAuxBind,eq_op_tbl,lt_op_tbl,box_con_tbl in TcGenDeriv
int32PrimTy/word32PrimTy are hardcoded in ParserCore
Add 'U' suffix for unsigned integer (Word32#,Word64#) unboxed literals
Re-enable shortCutLit for ints and floats (TcHsSyn)
Re-enable short cuts for int/float in tidyNPat in MatchLit
Add int range checks to lexer/parser (and put asserts back in mkMachInt32 and friends in Literal)
All kinds of bogus hardcoded MachInt32s in coercion functions in Literal (which will be rewritten anyway)
mkMachInt32/mkMachWord32 hardcoded in CoreSyn
Bring back HsInt HsLit, but with a PostTcType, use this in TcGenDerive, maybe add an -XNoInteger flag to disable integer overloading (always Int) no
word32PrimTy is hardcoded in wordDataCon in TysWiredIn
int32PrimTy is hardcoded in parrDataConDataCon in TysWiredIn
int32PrimTy/word32PrimTy are hardcoded in literalType in Literal
word32PrimTy is hardcoded in hsLitType in TcHsSyn
int32PrimTy,word32PrimTy are hardcoded in unboxArg in DsCCall
int32PrimTy,word32PrimTy,mkMachInt32 are hardcoded in resultWrapper in DsCCall
int32PrimTy is hardcoded in getPrimTyOf in DsForeign
word32PrimTy is hardcoded in dsCImport in DsForeign
word32PrimTy is hardcoded in dsFCall in DsForeign
word32PrimTy is hardcoded in ret_addr_arg in DsForeign
WRITE THIS DOWN:
for foreign imports auto box/unbox primitive types
JType -> (ClassRef,Insn,Insn) (box, box op, unbox op)
- this removed one of the major reasons for the ptrTypeKind
- maybe we don't need it after all...
- mkWeak# via ffi?
MAYBE
merge MutableArray and Array,
freeze :: Array# -> IO ()
- all array accesses are checked, use trap instruction on ppc
- maybe also uncheckedReadArray#
- always use unsafeIndex
- ditch sin/cos/tan as primops
- pull pi from java.lang.Math? (would need optimized final field imports)
use explicit foralls/kinds in primops.txt
- prevents us from having to make up weird alpha tyvar letters
- add LongDouble# ??
- maybe clean up foreign import types
- no State# for pure functions?
- tack a Void# on nullary functions
- they don't always return unboxed tuples (could be Int#, or even Void#)
State# :: ()
{Int,Word}{8,16,32,64}# :: +
{Bool,Addr}# :: &
?
/ \
?? (#)
/ \
* #
/ \
## ()
/ \
- +
/ \
& !
java malloc
byte[][]
(x>>32) is block
(int)x is offset
calculate block with hashcode and quadratic probing
when too many probes needed resize
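A minimal sketch of the address split described above, assuming a 64-bit address whose high 32 bits select the block (the outer index of byte[][]) and whose low 32 bits are the offset. mkAddr, blockOf and offsetOf are hypothetical names; the hashing and quadratic probing step is not modelled here:

```haskell
import Data.Bits (shiftL, shiftR, (.&.), (.|.))
import Data.Int (Int32, Int64)

-- Pack a block number and an offset into one 64-bit "pointer".
mkAddr :: Int32 -> Int32 -> Int64
mkAddr block off =
  (fromIntegral block `shiftL` 32) .|. (fromIntegral off .&. 0xffffffff)

-- (x >> 32) is the block.
blockOf :: Int64 -> Int32
blockOf x = fromIntegral (x `shiftR` 32)

-- (int)x is the offset; fromIntegral truncates like the Java cast.
offsetOf :: Int64 -> Int32
offsetOf x = fromIntegral x
```

Masking the offset in mkAddr keeps a negative offset from sign-extending into the block bits.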
- I think the enumeration tycons as product types optimization should be easy
- Introduce top level bindings for all data cons (no need to alloc)
- desugar pattern matches into (Box# tag#)
- tricky part is making the type look like one thing in the type checker and another in core, dunno if tcView/coreView help here
Random Thoughts on long standing problems
- Ditch FFI Type Iface Stuff
- We just say any datatype that is a product type with a single unboxed source argument (source prevents all H98 types from leaking in as FFI types) is FFI (and Bool as a hack)
- Java types
- Keep higher kinded types (i think the impl is in fotype_iface_stuff.patch)
- Checkcast, instanceof are special foreign imports
- Maybe make an optimized "final field" Id
- Maybe optimize upcasts to unsafeCoerce# in desugarer
- Polymorphic primops
- C-- like primops, (+#) :: forall bits . bits -> bits -> bits
- Do the UArray# :: # -> ! thing, maybe keep Array# separate, dunno
- This blows the PrimOpWrappers idea away, we're just going to say primops are unsupported in ghci (like unboxed tuples)
- Int# is gone, just {Int,Word}{8,16,32,64}
- Keep cast#, we're all for polymorphic primops now
- Target word size still leaks in with readOffAddr# :: WORD# -> INT# -> State# s -> (# State# s, bits #) and readArray#, not much we can do here without giving a false sense of portability
- Integer
- Ditch integer primops
- use the low level (add(mp_limb_t *dst, mp_limb_t *x, int xlen, mp_limb_t *y, int ylen)) gmp primops
- haskell always takes care of allocating, use ffi to get at gmp
- S# Int32#, go via long for overflow checking
- Possibly look for overflow-checking like patterns in the codegen and optimize them
- Pinned Arrays/Addr
- These are the only non-java primops left
- Support a "fake heap" of the form int[][], like nestedvm, but with huge pages, (128k maybe), 1 page per allocation so we can fake malloc easily, optionally use a NestedVM style memory space
- Pinned arrays are no longer ByteArray#s, special type (ForeignPtr#?). ForeignPtr in java is just an int with a finalizer attached
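To make the Integer item above concrete (Haskell owns allocation, the primitive is just a limb-wise add), here is a rough sketch of what such a low-level add computes. It uses plain lists of little-endian 32-bit limbs instead of real buffers, which is an assumption made purely for illustration; addLimbs is a hypothetical name, not a gmp function:

```haskell
import Data.Bits (shiftR)
import Data.Word (Word32, Word64)

-- Add two little-endian limb lists, propagating the carry in a Word64.
addLimbs :: [Word32] -> [Word32] -> [Word32]
addLimbs = go 0
  where
    go :: Word64 -> [Word32] -> [Word32] -> [Word32]
    go 0 [] []         = []
    go c [] []         = [fromIntegral c]      -- final carry becomes a new limb
    go c (x:xs) []     = step c x 0 xs []
    go c [] (y:ys)     = step c 0 y [] ys
    go c (x:xs) (y:ys) = step c x y xs ys

    step :: Word64 -> Word32 -> Word32 -> [Word32] -> [Word32] -> [Word32]
    step c x y xs ys =
      let s = fromIntegral x + fromIntegral y + c :: Word64
      in fromIntegral s : go (s `shiftR` 32) xs ys  -- low 32 bits, then carry
```

The result can be one limb longer than the longer input, which is why the caller would have to size the destination before calling a real primitive.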
No good anymore
- Only expose {Int,Word}{8,16,32,64},
newtype Int = Int Int32 or newtype Int = Int Int64
- We have
add32# :: Int32# -> Int32# -> Int32#, add64#, etc
- All ops work on every arch
Int32# is the center of the universe
int8ToInt32#, int16ToInt32#, int64ToInt32#, etc
type Char# = Int32#, char isn't so special
- Ditch overflow/carry ops, S# Int32#, but do ops in terms of Int64# (which is always available now)
add (S a) (S b) = toInteger (fromIntegral a + fromIntegral b :: Int64)
toInteger x = let { sign = x `shr` 31 } in if sign == 0 || sign == -1 then S x else J (int64ToInteger# x)
- Array# :: ?? -> ! is probably a good idea, but with the new Int/Word plan maybe just enumerate all types
- definitely ArrayRef# :: ! -> ! (need for class types)
- See how hard unified array would be to deal with in GHCi (since we can't PrimOpWrapperify readArray#)
- cast# is probably a bad idea (painful in GHCi), and new Int/Word plan greatly simplifies things
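The S#/Int64# scheme from the list above (do small-integer arithmetic in 64 bits, then check whether the result still fits in 32) can be sketched with boxed types. The type Z and the helpers below are stand-ins chosen for this example; in particular J wraps an ordinary Integer here rather than a real bignum representation:

```haskell
import Data.Bits (shiftR)
import Data.Int (Int32, Int64)

-- Small values live in S, everything else in J.
data Z = S Int32 | J Integer
  deriving (Eq, Show)

-- A 64-bit value fits in 32 bits iff its arithmetic shift by 31
-- is all zeros or all ones.
fromInt64 :: Int64 -> Z
fromInt64 x
  | sign == 0 || sign == -1 = S (fromIntegral x)
  | otherwise               = J (toInteger x)
  where sign = x `shiftR` 31

toBig :: Z -> Integer
toBig (S a) = toInteger a
toBig (J n) = n

-- Re-shrink a big result when it happens to fit again.
normalize :: Integer -> Z
normalize n
  | n >= toInteger (minBound :: Int32) && n <= toInteger (maxBound :: Int32)
      = S (fromInteger n)
  | otherwise = J n

addZ :: Z -> Z -> Z
addZ (S a) (S b) = fromInt64 (fromIntegral a + fromIntegral b)
addZ x y         = normalize (toBig x + toBig y)
```

The fast path never needs an overflow or carry primop, because the sum of two 32-bit values cannot overflow 64 bits.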
High Priority
- Make it so HEAD can rebuild itself again
Rest
- go through all library changes
- go through primops.txt and disable more non-java stuff
- need to update updatable closures with a "throw thunk" when there is an exception
- Get the java exception handling integration going
- turn off GHC.Arr's bounds checks (make sure OutOfBoundException turns into whatever hs exn should be thrown)
- Same with div by zero
- Stable names
- Concurrent Haskell
- Goto 0 tail recursion optimization
- letnoescapes
- add "enabled" to primtypes
- Would addIntC# and friends be more efficient via long? That's what BigInteger does
- Desugar genuine Ints >= 2^29 to broken down Int operations, not Integer, overflow should be the same
isXXXTypeKind (tyConKind ...) is wrong, I think we do it in a few places, should be the *result* kind
- Ordering is an FFI type, if glasgow-exts, desugars to if x < 0 then LT else if x > 0 then GT else EQ
- Implement JVMUpCast/JVMCheckCast/JVMInstanceOf
- Lots of foreign calls can be "cheap" (field accesses, checkcast/instanceof, etc)
- Turn upcasts into unsafeCoerce# in the desugarer
- Does the realWorld# token on non-io foreign imports hurt at all?
- Do a codegen check on foreign types, just like foreign imports
- isFFIPtrTy -> isFFIObjTy?
- Make sure ghc -java --make works.
- -odir should create parent dirs like javac's -d
- kind defaulting for foreign decls should be done in kcTyClDecl I think (look at type families)
- "implements" entity should also map to upcast
- tcForeign should check for valid java identifiers
- GCode.PrimTy should be TyApp now, i don't think we need lifted
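On the question above of doing addIntC# via long: a sketch of the idea with boxed types, where the overflow flag is simply whether the 64-bit sum survives a round trip through 32 bits. addIntC is a hypothetical boxed analogue written for this note, and it returns a Bool flag where the real primop returns an Int#:

```haskell
import Data.Int (Int32, Int64)

-- Add in 64 bits, truncate to 32, and report whether truncation lost anything.
addIntC :: Int32 -> Int32 -> (Int32, Bool)
addIntC a b = (narrow, overflowed)
  where
    wide       = fromIntegral a + fromIntegral b :: Int64
    narrow     = fromIntegral wide :: Int32
    overflowed = wide /= fromIntegral narrow
```

Whether this beats a dedicated overflow check would still need measuring on the actual backend.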
Minor
get rid of Type.PrimRep, put it in the codegen (probably can go right from type to SMRep, I don't think primrep is used anywhere else)
getPrimOpResultInfo too
- tyConPrimRep too
- Port HEAD Data.ByteString to java, include in "core-packages", ensure the regex packages work now
- update the ghc users guide
- try to reduce dependency on hsutils
- (eventually make a "hsjava-lite" that lives in the ghc source tree)
- the simplifier nukes unfolding info for "loop breakers" which means in let { xs = 1 : xs } in xs, xs doesn't look like it is evaluated
- Do something better with stubbed out modules, just don't compile them in the first place
- System.Process
- System.Time
- System.Directory
- build process doesn't get hsc/java stuff
gcc -O -c System/CPUTime_hsc.c -o System/CPUTime_hsc.class
- try to thread line numbers down to java
- better name for GCode
- hsjava peephole optimizer,
`ltWord#` int2Word# 0x10ffff generates this
cleanup NewUArray element size swizzling in cmm backend
- MO_*_Conv for I64 isn't implemented (at least on ppc), the libraries never actually call this, but it should work
litIsTrivial: strings are trivial in java (stored in constant pool), should be via-c too, maybe not nativegen? right, not worth fixing
- Java's Float -> Int conversion saturates rather than converting mod 2^32, this makes truncate :: Float -> Int return different results than (fromIntegral :: Integer -> Int) . (truncate :: Float -> Integer) which is arguably a bug
- could we eliminate some of the ifdefs in GHC.Word and GHC.Int with some rule like x <= MAX_INT = True ?
- tag2con's switch can subsume the bounds check required by toEnum and friends (all this is in TcGenDeriv)
- Fix Data.Array.*
- I think it needs a rewrite, base it off an "Unboxable" class, which is derivable
- We might want to make unsafeFreeze a standalone method that only works over Array/STArray, which is the only place it worked anyway
- Not sure about freeze/thaw via arrayCopy though
- Maybe we just punt on the generic interface to them too, nobody else's instances will be able to be optimized anyway
- Cleanup GHC.Num
- Split out big integer stuff. We'll have GHC.BigInteger, which exports BigInteger which implements Num and friends
- Integer is just a thin wrapper around either Int or BigInteger
- do something sane with Void types in array and cast ops
- cast is easy (either dump it or make up "42" or something)
- not sure about array, obviously all nops but what about length?
- Something isn't right with the mkAlphaTyVars uniques, there are some magic unique numbers I don't know about
- splitOpenTypeRepCo_maybe does an unnecessary coreView, should either be just splitTyConApp or pattern match on TyConApp
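The Float -> Int discrepancy noted in this list can be reproduced without a JVM. The sketch below models the saturating behaviour of a Java (int) cast next to the wrap-around route via Integer; both function names are made up for the example, and Int32 stands in for a 32-bit Int:

```haskell
import Data.Int (Int32)

-- Models a Java (int) cast of a float: NaN becomes 0, out-of-range
-- values clamp to the nearest bound.
saturatingTruncate :: Float -> Int32
saturatingTruncate f
  | isNaN f          = 0
  | f >= 2147483648  = maxBound
  | f <= -2147483648 = minBound
  | otherwise        = fromInteger (truncate f)

-- Going via Integer wraps mod 2^32 instead (finite inputs only).
wrappingTruncate :: Float -> Int32
wrappingTruncate f = fromInteger (truncate f :: Integer)
```

For 3.0e9, which is exactly representable as a Float, the first gives maxBound and the second gives 3000000000 - 2^32 = -1294967296. The two agree whenever the value is in range.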
Done
re-enable closure field nuking
Get the testsuite and nofib working again, run it regularly The testsuite kind of sucks, half of it fails with the stock GHC
make GCodeAlloc make the distinction between definition and use
check that we don't have two binds with the same Var loc in the env, if so we might put an in-use var on the free list
put "fake" primops (64-bit ops, java big integer ops, etc) in GHC.Base (or somewhere central) GHC.Num is central enough
maybe use Void# type in fake primops to maintain symmetry
rewrite bi_* functions to look like integer primops
cast# (or maybe unsafeCast#) for ALL casts. chr#, etc defined in terms of it, maps to MO_*_Conv and Convert
We could add a prelude rule that turns unsafeCast# u v where u and v are rep'd the same into unsafeCoerce#
this would invoke peeking into the code gen which is sort of ugly but tolerable, much better than needing "pseudo primops" that map to unsafeCoerce# in Base, this would also avoid having so much WORD_SIZE related stuff in base not worth it
Reenable prelude rules for coercions
Add some "magic literals" for WORD_SIZE_IN_BITS, etc nope, GHC.Base is so tied to the target arch anyway this doesn't buy much
nonIOOk in TcForeign for field setter, doesn't make sense
maybe virtual hsjava instructions for unsigned types (in terms of I2L/L2I/AND/etc) No dice
() -> RealWorldState in foreign imports, all foreign imported functions have the result type (# State# RealWorld, foo #)
maybe have a new "unit" unboxed ty, rep'ed by Void still
StgFCallOp would now store only the foo's type, not the whole unboxed tuple
"Warning: Defined but not used: data constructor" when dcon is only used in ffi imports
base36 uniques
Dangerous-looking argument. Probable cause: bad unsafeCoerce# in GHC/Weak.hs
ensure small word (word8,word16) ops are done in terms of the faster integer ops, not the slow 32-bit word ops
in fact, check all uses of word ops in base, stuff like chr is implemented using int2word/ltWord
GCode is parameterized over the local var index type, we generate it with Uniques then a second pass does local var allocation (and maybe nuking)
mv primops.txt.pp to primops.txt, no need for it to be preprocessed anymore
add a flag to disable primops for certain codegens, so java doesn't even see indexOffAddr#, etc
fix the mess I made of GenPrimOpCode
maybe just say addrPrimTy `notElem` args for java? I think that covers it
Hrm... except for forkIO, threadWaitRead#, etc
deriving Eq for primitive foreign types
checkcast# :: u -> v, null# :: u, eqObject# :: u -> u -> Bool
u and v MUST be foreign types, this will be checked by that tcHugeHugeUglyHack function in the typechecker new plan, see below
how about instanceof# :: u -> (# Bool#, v #)
maybe think about some way of unifying the wired in String#, ThreadId#, etc and their foreign imported counterparts (right now they look like two different types, also unifying equal foreign imports) feature
Hrm... we could just put ext_name on those wired in prims. I don't think it would hurt the native codegen. It would be java specific though. Or we could argue it is a feature (like primitive newtypes or something) and punt on the whole thing
ditch forkio primitive, create new threads with new Thread(Runnable r), Closure implements Runnable (only valid for IO ()), IO () -> StablePtr# (IO ()) -> Runnable# later
we don't really need Stmnt.If, just use case on 0/1, maybe catch it in the backend
sort out the DCon stuff in StgExpr
why StgConApp and not StgLet/return? No good reason, probably hangover from register return, gone now
what about no-arg cons, always point to the worker?
we need to discard more of the environment (Keep) when doing an StgApp in an eval context outside a case (happens in case of case). we don't get any info on this from Stg good enough for now
I'm not sure why we have Nop
A bunch of code in PrelRules makes some crazy assumptions about the host/target being the same (I just squeak by because I'm on a 32-bit arch)
IMHO, the code shouldn't use Data.Bits, Data.Word, fromIntegral, etc at all, everything should be implemented on top of Integer.
Same with the Float/Double and Rational
Could we even get by assuming the target has an infinite number of bits in its words and having the code gen truncate lits when it emits them? sure the compiler wouldn't be able to tell that MAX_INT + 1 == MIN_INT but I think that's ok
Maybe better (more conservative) would be to get the target's word size from dynflags somehow (need to pipe that through though)
implement optimized Case of StgApp (like in CgCase) good enough
org.haskell.ghc.rts is a mess
instance Outputable G.Module
maybe get ghc to delete Blah$*.class before outputting Blah.class and friends not really ghc's job
Core-to-Core pass to turn case of lit float to tree search (must still get the simplifier to hit it though)
Bool# is guaranteed to be 0 or 1 (maybe include Boolean literal), c foreign imports must map to Int# and do a case, java uses Bool# directly
make foreign exports a GlobalIdDetail, no need for the (Id,FastString) pair junk
switch to util/OrdList
allow String# literals to have chars 0-65536
can RawBuffer in IO/Handle be UArray# Int8# (avoid the &0xff on load)? nope, gotta happen sometime
add a check to the typePrimRep for unlifted tyvars
(I# 0x80000000) < 0 is constant folded to false because wraparound isn't taken into account
rules involving undefined behavior shouldn't fire (if shift sees constant < 0 or > 32 just pass it through, int2float of > max_int, etc)
Int64# literals, use 123## syntax (like double), fix GHC/Float to use them meh, not worth it
use Bool# for more primops meh
Is multiplication the same for signed and unsigned? instance Num Word64 seems to think so
cleanup all import lists
compile with -Wall -Werror
Split the unlifted kind into pointer and non-pointer types
nullRef# would have type forall ptr_u :: ptr_u
eqObject# (rename to eqRef#, also subsumes same*# functions) has ptr_u type
instead of isFFIClass we just check for ref kind, in javagen we map array types to Object for unclass
unsafeCast# only works over nonptr_u
deriving Eq for foreign tycons now works over ptr_u
Make string literals proper closure objects in nativegen
charAt# prim, use it in unpackString#
tag2enum# is a pain, dump it
isTrue# :: Bool# -> Bool (just a normal function now)
do we need if# or somesuch too? see how often prim cmps are used directly
Optimize switch on (Global XX (Just tag)) don't need anymore
mustExposeTyCon in TidyPgm needs looking into. we probably aren't exposing unexported foreign type wrappers, isFFITy is all * go * {-check zonkQuantifiedTyVar, should we check for not lifted instead of unlifted?
Merge AddrRep and WordRep, and MachNullAddr and MachWord
maybe put a type field in MachInt, avoid some unsafeCoerce#ery in PrelRules
true# and false# will not be IntLits with a bool type (don't bother to make primops for them)
unsafeCast# (now just plain old cast#?) should work with bool types, toBool is i != 0, fromBool is id, no more IsTrueOp
bool is always 0 or 1, check Bool typed foreign imports to ensure they return the right ints
get rid of isPrimitiveType{,TyCon}, it seems to be confused with isUnLiftedType a lot, and I think the intention is usually isUnLifted
remove stableptr from java
cleanup unpackString#, no unsafeCoerce#s, proper utf-16 decoding everywhere
Unify Array# and UArray#
Elem would have a kind of ?? (ArgTypeKind)
Be careful with the freeze and thaw stuff, they have to talk to the gc for PtrRep types
- {-Ensure kind defaulting works right, foo = indexArray# get forall'ed over a tyvar of kind *, yet still be possible to apply to an unlifted-}
Check for initialization with a literal 0, avoid the call to Arrays.fill
check widen_cast in TcGenDeriv, stuff < 31bits should go via int not word
add Any# :: !, use it for Null type in litType
UnLifted -> Unlifted too scared of darcs conflicts
I really want to get rid of unsafeCoerce# (at least externally) would be nice, too much work to lose sleep over though
We probably need some way to express newtype casts
We'd need State# s1 -> State# s2 too (could we just use cast#?)
the builtin rules thing seems kind of ugly now that we're passing DynFlags all around, maybe clean it up
I'd like those rules to come in directly from loadIface for the primOpRules better
maybe add dflags as an arg to mkPrimOpId, both for the rules and to ASSERT that it is enabled
make sure unsafeCoerce# only coerces between compatible kinds
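Several items in this list (the host/target assumptions in PrelRules, and the (I# 0x80000000) < 0 folding bug) come down to folding constants on Integer and narrowing to the target word size explicitly. A minimal sketch of that approach, assuming the width arrives from somewhere like dynflags; Target, narrowSigned, foldAdd and foldLtZero are hypothetical names, not functions from PrelRules:

```haskell
-- Hypothetical description of the target; only the word width matters here.
newtype Target = Target { wordBits :: Int }

-- Wrap an unbounded Integer into the signed range of the target word.
narrowSigned :: Target -> Integer -> Integer
narrowSigned t n
  | m >= half = m - modulus
  | otherwise = m
  where
    modulus = 2 ^ wordBits t
    half    = modulus `div` 2
    m       = n `mod` modulus   -- non-negative because modulus > 0

-- Fold an addition the way the target would compute it.
foldAdd :: Target -> Integer -> Integer -> Integer
foldAdd t a b = narrowSigned t (a + b)

-- Fold "lit < 0" only after the literal has been narrowed.
foldLtZero :: Target -> Integer -> Bool
foldLtZero t lit = narrowSigned t lit < 0
```

With a 32-bit target 0x80000000 narrows to a negative number, so the comparison folds to True; with a 64-bit target it stays positive. Nothing here depends on the word size of the machine running the compiler.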