nasmdoc.src 325 KB

  1. \# --------------------------------------------------------------------------
  2. \#
  3. \# Copyright 1996-2018 The NASM Authors - All Rights Reserved
  4. \# See the file AUTHORS included with the NASM distribution for
  5. \# the specific copyright holders.
  6. \#
  7. \# Redistribution and use in source and binary forms, with or without
  8. \# modification, are permitted provided that the following
  9. \# conditions are met:
  10. \#
  11. \# * Redistributions of source code must retain the above copyright
  12. \# notice, this list of conditions and the following disclaimer.
  13. \# * Redistributions in binary form must reproduce the above
  14. \# copyright notice, this list of conditions and the following
  15. \# disclaimer in the documentation and/or other materials provided
  16. \# with the distribution.
  17. \#
  18. \# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND
  19. \# CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES,
  20. \# INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF
  21. \# MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
  22. \# DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR
  23. \# CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
  24. \# SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT
  25. \# NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
  26. \# LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
  27. \# HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
  28. \# CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR
  29. \# OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE,
  30. \# EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
  31. \#
  32. \# --------------------------------------------------------------------------
  33. \#
  34. \# Source code to NASM documentation
  35. \#
  36. \M{category}{Programming}
  37. \M{title}{NASM - The Netwide Assembler}
  38. \M{year}{1996-2017}
  39. \M{author}{The NASM Development Team}
  40. \M{copyright_tail}{-- All Rights Reserved}
  41. \M{license}{This document is redistributable under the license given in the file "LICENSE" distributed in the NASM archive.}
  42. \M{summary}{This file documents NASM, the Netwide Assembler: an assembler targeting the Intel x86 series of processors, with portable source.}
  43. \M{infoname}{NASM}
  44. \M{infofile}{nasm}
  45. \M{infotitle}{The Netwide Assembler for x86}
  46. \M{epslogo}{nasmlogo.eps}
  47. \M{logoyadj}{-72}
  48. \& version.src
  49. \IR{-D} \c{-D} option
  50. \IR{-E} \c{-E} option
  51. \IR{-F} \c{-F} option
  52. \IR{-I} \c{-I} option
  53. \IR{-M} \c{-M} option
  54. \IR{-MD} \c{-MD} option
  55. \IR{-MF} \c{-MF} option
  56. \IR{-MG} \c{-MG} option
  57. \IR{-MP} \c{-MP} option
  58. \IR{-MQ} \c{-MQ} option
  59. \IR{-MT} \c{-MT} option
  60. \IR{-MW} \c{-MW} option
  61. \IR{-O} \c{-O} option
  62. \IR{-P} \c{-P} option
  63. \IR{-U} \c{-U} option
  64. \IR{-X} \c{-X} option
  65. \IR{-a} \c{-a} option
  66. \IR{-d} \c{-d} option
  67. \IR{-e} \c{-e} option
  68. \IR{-f} \c{-f} option
  69. \IR{-g} \c{-g} option
  70. \IR{-i} \c{-i} option
  71. \IR{-l} \c{-l} option
  72. \IR{-o} \c{-o} option
  73. \IR{-p} \c{-p} option
  74. \IR{-s} \c{-s} option
  75. \IR{-u} \c{-u} option
  76. \IR{-v} \c{-v} option
  77. \IR{-W} \c{-W} option
  78. \IR{-Werror} \c{-Werror} option
  79. \IR{-Wno-error} \c{-Wno-error} option
  80. \IR{-w} \c{-w} option
  81. \IR{-y} \c{-y} option
  82. \IR{-Z} \c{-Z} option
  83. \IR{!=} \c{!=} operator
  84. \IR{$, here} \c{$}, Here token
  85. \IR{$, prefix} \c{$}, prefix
  86. \IR{$$} \c{$$} token
  87. \IR{%} \c{%} operator
  88. \IR{%%} \c{%%} operator
  89. \IR{%+1} \c{%+1} and \c{%-1} syntax
  90. \IA{%-1}{%+1}
  91. \IR{%0} \c{%0} parameter count
  92. \IR{&} \c{&} operator
  93. \IR{&&} \c{&&} operator
  94. \IR{*} \c{*} operator
  95. \IR{..@} \c{..@} symbol prefix
  96. \IR{/} \c{/} operator
  97. \IR{//} \c{//} operator
  98. \IR{<} \c{<} operator
  99. \IR{<<} \c{<<} operator
  100. \IR{<=} \c{<=} operator
  101. \IR{<>} \c{<>} operator
  102. \IR{=} \c{=} operator
  103. \IR{==} \c{==} operator
  104. \IR{>} \c{>} operator
  105. \IR{>=} \c{>=} operator
  106. \IR{>>} \c{>>} operator
  107. \IR{?} \c{?} MASM syntax
  108. \IR{^} \c{^} operator
  109. \IR{^^} \c{^^} operator
  110. \IR{|} \c{|} operator
  111. \IR{||} \c{||} operator
  112. \IR{~} \c{~} operator
  113. \IR{%$} \c{%$} and \c{%$$} prefixes
  114. \IA{%$$}{%$}
  115. \IR{+ opaddition} \c{+} operator, binary
  116. \IR{+ opunary} \c{+} operator, unary
  117. \IR{+ modifier} \c{+} modifier
  118. \IR{- opsubtraction} \c{-} operator, binary
  119. \IR{- opunary} \c{-} operator, unary
  120. \IR{! opunary} \c{!} operator, unary
  121. \IR{alignment, in bin sections} alignment, in \c{bin} sections
  122. \IR{alignment, in elf sections} alignment, in \c{elf} sections
  123. \IR{alignment, in win32 sections} alignment, in \c{win32} sections
  124. \IR{alignment, of elf common variables} alignment, of \c{elf} common
  125. variables
  126. \IR{alignment, in obj sections} alignment, in \c{obj} sections
  127. \IR{a.out, bsd version} \c{a.out}, BSD version
  128. \IR{a.out, linux version} \c{a.out}, Linux version
  129. \IR{autoconf} Autoconf
  130. \IR{bin} bin
  131. \IR{bitwise and} bitwise AND
  132. \IR{bitwise or} bitwise OR
  133. \IR{bitwise xor} bitwise XOR
  134. \IR{block ifs} block IFs
  135. \IR{borland pascal} Borland, Pascal
  136. \IR{borland's win32 compilers} Borland, Win32 compilers
  137. \IR{braces, after % sign} braces, after \c{%} sign
  138. \IR{bsd} BSD
  139. \IR{c calling convention} C calling convention
  140. \IR{c symbol names} C symbol names
  141. \IA{critical expressions}{critical expression}
  142. \IA{command line}{command-line}
  143. \IA{case sensitivity}{case sensitive}
  144. \IA{case-sensitive}{case sensitive}
  145. \IA{case-insensitive}{case sensitive}
  146. \IA{character constants}{character constant}
  147. \IR{codeview} CodeView debugging format
  148. \IR{common object file format} Common Object File Format
  149. \IR{common variables, alignment in elf} common variables, alignment
  150. in \c{elf}
  151. \IR{common, elf extensions to} \c{COMMON}, \c{elf} extensions to
  152. \IR{common, obj extensions to} \c{COMMON}, \c{obj} extensions to
  153. \IR{declaring structure} declaring structures
  154. \IR{default-wrt mechanism} default-\c{WRT} mechanism
  155. \IR{devpac} DevPac
  156. \IR{djgpp} DJGPP
  157. \IR{dll symbols, exporting} DLL symbols, exporting
  158. \IR{dll symbols, importing} DLL symbols, importing
  159. \IR{dos} DOS
  160. \IR{dos archive} DOS archive
  161. \IR{dos source archive} DOS source archive
  162. \IA{effective address}{effective addresses}
  163. \IA{effective-address}{effective addresses}
  164. \IR{elf} ELF
  165. \IR{elf, 16-bit code and} ELF, 16-bit code and
  166. \IR{elf shared libraries} ELF, shared libraries
  167. \IR{elf32} \c{elf32}
  168. \IR{elf64} \c{elf64}
  169. \IR{elfx32} \c{elfx32}
  170. \IR{executable and linkable format} Executable and Linkable Format
  171. \IR{extern, obj extensions to} \c{EXTERN}, \c{obj} extensions to
  172. \IR{extern, rdf extensions to} \c{EXTERN}, \c{rdf} extensions to
  173. \IR{floating-point, constants} floating-point, constants
  174. \IR{floating-point, packed bcd constants} floating-point, packed BCD constants
  175. \IR{freebsd} FreeBSD
  176. \IR{freelink} FreeLink
  177. \IR{functions, c calling convention} functions, C calling convention
  178. \IR{functions, pascal calling convention} functions, Pascal calling
  179. convention
  180. \IR{global, aoutb extensions to} \c{GLOBAL}, \c{aoutb} extensions to
  181. \IR{global, elf extensions to} \c{GLOBAL}, \c{elf} extensions to
  182. \IR{global, rdf extensions to} \c{GLOBAL}, \c{rdf} extensions to
  183. \IR{got} GOT
  184. \IR{got relocations} \c{GOT} relocations
  185. \IR{gotoff relocation} \c{GOTOFF} relocations
  186. \IR{gotpc relocation} \c{GOTPC} relocations
  187. \IR{intel number formats} Intel number formats
  188. \IR{linux, elf} Linux, ELF
  189. \IR{linux, a.out} Linux, \c{a.out}
  190. \IR{linux, as86} Linux, \c{as86}
  191. \IR{logical and} logical AND
  192. \IR{logical or} logical OR
  193. \IR{logical xor} logical XOR
  194. \IR{mach object file format} Mach, object file format
  195. \IA{mach-o}{macho}
  196. \IR{mach-o} Mach-O, object file format
  197. \IR{macho32} \c{macho32}
  198. \IR{macho64} \c{macho64}
  199. \IR{macos x} MacOS X
  200. \IR{masm} MASM
  201. \IA{memory reference}{memory references}
  202. \IR{minix} Minix
  203. \IA{misc directory}{misc subdirectory}
  204. \IR{misc subdirectory} \c{misc} subdirectory
  205. \IR{microsoft omf} Microsoft OMF
  206. \IR{mmx registers} MMX registers
  207. \IA{modr/m}{modr/m byte}
  208. \IR{modr/m byte} ModR/M byte
  209. \IR{ms-dos} MS-DOS
  210. \IR{ms-dos device drivers} MS-DOS device drivers
  211. \IR{multipush} \c{multipush} macro
  212. \IR{nan} NaN
  213. \IR{nasm version} NASM version
  214. \IR{netbsd} NetBSD
  215. \IR{nsis} NSIS
  216. \IR{nullsoft scriptable installer} Nullsoft Scriptable Installer
  217. \IR{omf} OMF
  218. \IR{openbsd} OpenBSD
  219. \IR{operating system} operating system
  220. \IR{os/2} OS/2
  221. \IR{pascal calling convention} Pascal calling convention
  222. \IR{passes} passes, assembly
  223. \IR{perl} Perl
  224. \IR{pic} PIC
  225. \IR{pharlap} PharLap
  226. \IR{plt} PLT
  227. \IR{plt relocations} \c{PLT} relocations
  228. \IA{pre-defining macros}{pre-define}
  229. \IA{preprocessor expressions}{preprocessor, expressions}
  230. \IA{preprocessor loops}{preprocessor, loops}
  231. \IA{preprocessor variables}{preprocessor, variables}
  232. \IA{rdoff subdirectory}{rdoff}
  233. \IR{rdoff} \c{rdoff} subdirectory
  234. \IR{relocatable dynamic object file format} Relocatable Dynamic
  235. Object File Format
  236. \IR{relocations, pic-specific} relocations, PIC-specific
  237. \IA{repeating}{repeating code}
  238. \IR{section alignment, in elf} section alignment, in \c{elf}
  239. \IR{section alignment, in bin} section alignment, in \c{bin}
  240. \IR{section alignment, in obj} section alignment, in \c{obj}
  241. \IR{section alignment, in win32} section alignment, in \c{win32}
  242. \IR{section, elf extensions to} \c{SECTION}, \c{elf} extensions to
  243. \IR{section, macho extensions to} \c{SECTION}, \c{macho} extensions to
  244. \IR{section, win32 extensions to} \c{SECTION}, \c{win32} extensions to
  245. \IR{segment alignment, in bin} segment alignment, in \c{bin}
  246. \IR{segment alignment, in obj} segment alignment, in \c{obj}
  247. \IR{segment, obj extensions to} \c{SEGMENT}, \c{obj} extensions to
  248. \IR{segment names, borland pascal} segment names, Borland Pascal
  249. \IR{shift command} \c{shift} command
  250. \IA{sib}{sib byte}
  251. \IR{sib byte} SIB byte
  252. \IR{align, smart} \c{ALIGN}, smart
  253. \IA{sectalign}{sectalign}
  254. \IR{solaris x86} Solaris x86
  255. \IA{standard section names}{standardized section names}
  256. \IR{symbols, exporting from dlls} symbols, exporting from DLLs
  257. \IR{symbols, importing from dlls} symbols, importing from DLLs
  258. \IR{test subdirectory} \c{test} subdirectory
  259. \IR{tlink} \c{TLINK}
  260. \IR{underscore, in c symbols} underscore, in C symbols
  261. \IR{unicode} Unicode
  262. \IR{unix} Unix
  263. \IR{utf-8} UTF-8
  264. \IR{utf-16} UTF-16
  265. \IR{utf-32} UTF-32
  266. \IA{sco unix}{unix, sco}
  267. \IR{unix, sco} Unix, SCO
  268. \IA{unix source archive}{unix, source archive}
  269. \IR{unix, source archive} Unix, source archive
  270. \IA{unix system v}{unix, system v}
  271. \IR{unix, system v} Unix, System V
  272. \IR{unixware} UnixWare
  273. \IR{val} VAL
  274. \IR{version number of nasm} version number of NASM
  275. \IR{visual c++} Visual C++
  276. \IR{www page} WWW page
  277. \IR{win32} Win32
  278. \IR{win64} Win64
  279. \IR{windows} Windows
  280. \IR{windows 95} Windows 95
  281. \IR{windows nt} Windows NT
  282. \# \IC{program entry point}{entry point, program}
  283. \# \IC{program entry point}{start point, program}
  284. \# \IC{MS-DOS device drivers}{device drivers, MS-DOS}
  285. \# \IC{16-bit mode, versus 32-bit mode}{32-bit mode, versus 16-bit mode}
  286. \# \IC{c symbol names}{symbol names, in C}
  287. \C{intro} Introduction
  288. \H{whatsnasm} What Is NASM?
  289. The Netwide Assembler, NASM, is an 80x86 and x86-64 assembler designed
  290. for portability and modularity. It supports a range of object file
  291. formats, including Linux and \c{*BSD} \c{a.out}, \c{ELF}, \c{COFF},
  292. \c{Mach-O}, 16-bit and 32-bit \c{OBJ} (OMF) format, \c{Win32} and
  293. \c{Win64}. It will also output plain binary files, Intel hex and
  294. Motorola S-Record formats. Its syntax is designed to be simple and
  295. easy to understand, similar to the syntax in the Intel Software
  296. Developer Manual with minimal complexity. It supports all currently
  297. known x86 architectural extensions, and has strong support for macros.
  298. NASM also comes with a set of utilities for handling the \c{RDOFF}
  299. custom object-file format.
  300. \S{legal} \i{License} Conditions
  301. Please see the file \c{LICENSE}, supplied as part of any NASM
  302. distribution archive, for the license conditions under which you may
  303. use NASM. NASM is now under the so-called 2-clause BSD license, also
  304. known as the simplified BSD license.
  305. Copyright 1996-2017 the NASM Authors - All rights reserved.
  306. Redistribution and use in source and binary forms, with or without
  307. modification, are permitted provided that the following conditions are
  308. met:
  309. \b Redistributions of source code must retain the above copyright
  310. notice, this list of conditions and the following disclaimer.
  311. \b Redistributions in binary form must reproduce the above copyright
  312. notice, this list of conditions and the following disclaimer in the
  313. documentation and/or other materials provided with the distribution.
  314. THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND
  315. CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES,
  316. INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF
  317. MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
  318. DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR
  319. CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
  320. SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT
  321. NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
  322. LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
  323. HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
  324. CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR
  325. OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE,
  326. EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
  327. \C{running} Running NASM
  328. \H{syntax} NASM \i{Command-Line} Syntax
  329. To assemble a file, you issue a command of the form
  330. \c nasm -f <format> <filename> [-o <output>]
  331. For example,
  332. \c nasm -f elf myfile.asm
  333. will assemble \c{myfile.asm} into an \c{ELF} object file \c{myfile.o}. And
  334. \c nasm -f bin myfile.asm -o myfile.com
  335. will assemble \c{myfile.asm} into a raw binary file \c{myfile.com}.
  336. To produce a listing file, with the hex codes output from NASM
  337. displayed on the left of the original sources, use the \c{-l} option
  338. to give a listing file name, for example:
  339. \c nasm -f coff myfile.asm -l myfile.lst
  340. To get further usage instructions from NASM, try typing
  341. \c nasm -h
  342. The option \c{--help} is an alias for the \c{-h} option.
  343. The option \c{-hf} will also list the available output file formats,
  344. and what they are.
  345. If you use Linux but aren't sure whether your system is \c{a.out}
  346. or \c{ELF}, type
  347. \c file nasm
  348. (in the directory in which you put the NASM binary when you
  349. installed it). If it says something like
  350. \c nasm: ELF 32-bit LSB executable i386 (386 and up) Version 1
  351. then your system is \c{ELF}, and you should use the option \c{-f elf}
  352. when you want NASM to produce Linux object files. If it says
  353. \c nasm: Linux/i386 demand-paged executable (QMAGIC)
  354. or something similar, your system is \c{a.out}, and you should use
  355. \c{-f aout} instead. (Linux \c{a.out} systems have long been obsolete,
  356. and are rare these days.)
  357. Like Unix compilers and assemblers, NASM is silent unless it
  358. goes wrong: you won't see any output at all, unless it gives error
  359. messages.
  360. \S{opt-o} The \i\c{-o} Option: Specifying the Output File Name
  361. NASM will normally choose the name of your output file for you;
  362. precisely how it does this is dependent on the object file format.
  363. For Microsoft object file formats (\c{obj}, \c{win32} and \c{win64}),
  364. it will remove the \c{.asm} \i{extension} (or whatever extension you
  365. like to use - NASM doesn't care) from your source file name and
  366. substitute \c{.obj}. For Unix object file formats (\c{aout}, \c{as86},
  367. \c{coff}, \c{elf32}, \c{elf64}, \c{elfx32}, \c{ieee}, \c{macho32} and
  368. \c{macho64}) it will substitute \c{.o}. For \c{dbg}, \c{rdf}, \c{ith}
  369. and \c{srec}, it will use \c{.dbg}, \c{.rdf}, \c{.ith} and \c{.srec},
  370. respectively, and for the \c{bin} format it will simply remove the
  371. extension, so that \c{myfile.asm} produces the output file \c{myfile}.
  372. If the output file already exists, NASM will overwrite it, unless it
  373. has the same name as the input file, in which case it will give a
  374. warning and use \i\c{nasm.out} as the output file name instead.
  375. For situations in which this behaviour is unacceptable, NASM
  376. provides the \c{-o} command-line option, which allows you to specify
  377. your desired output file name. You invoke \c{-o} by following it
  378. with the name you wish for the output file, either with or without
  379. an intervening space. For example:
  380. \c nasm -f bin program.asm -o program.com
  381. \c nasm -f bin driver.asm -odriver.sys
  382. Note that this is a small o, and is different from a capital O, which
  383. is used to specify the number of optimisation passes required. See \k{opt-O}.
  384. \S{opt-f} The \i\c{-f} Option: Specifying the \i{Output File Format}
  385. If you do not supply the \c{-f} option to NASM, it will choose an
  386. output file format for you itself. In the distribution versions of
  387. NASM, the default is always \i\c{bin}; if you've compiled your own
  388. copy of NASM, you can redefine \i\c{OF_DEFAULT} at compile time and
  389. choose what you want the default to be.
  390. Like \c{-o}, the intervening space between \c{-f} and the output
  391. file format is optional; so \c{-f elf} and \c{-felf} are both valid.
  392. A complete list of the available output file formats can be given by
  393. issuing the command \i\c{nasm -hf}.
  394. \S{opt-l} The \i\c{-l} Option: Generating a \i{Listing File}
  395. If you supply the \c{-l} option to NASM, followed (with the usual
  396. optional space) by a file name, NASM will generate a
  397. \i{source-listing file} for you, in which addresses and generated
  398. code are listed on the left, and the actual source code, with
  399. expansions of multi-line macros (except those which specifically
  400. request no expansion in source listings: see \k{nolist}) on the
  401. right. For example:
  402. \c nasm -f elf myfile.asm -l myfile.lst
  403. If a list file is selected, you may turn off listing for a
  404. section of your source with \c{[list -]}, and turn it back on
  405. with \c{[list +]} (the default). There is no "user
  406. form" (without the brackets). This can be used to list only
  407. sections of interest, avoiding excessively long listings.
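A minimal sketch of this, assuming a hypothetical include file
\c{bigtable.inc} whose contents are not wanted in the listing:

\c [list -]
\c %include "bigtable.inc"
\c [list +]

Source lines following \c{[list +]} appear in the listing file again.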
  408. \S{opt-M} The \i\c{-M} Option: Generate \i{Makefile Dependencies}
  409. This option can be used to generate makefile dependencies on stdout.
  410. This can be redirected to a file for further processing. For example:
  411. \c nasm -M myfile.asm > myfile.dep
  412. \S{opt-MG} The \i\c{-MG} Option: Generate \i{Makefile Dependencies}
  413. This option can be used to generate makefile dependencies on stdout.
  414. This differs from the \c{-M} option in that if a nonexistent file is
  415. encountered, it is assumed to be a generated file and is added to the
  416. dependency list without a prefix.
  417. \S{opt-MF} The \i\c{-MF} Option: Set Makefile Dependency File
  418. This option can be used with the \c{-M} or \c{-MG} options to send the
  419. output to a file, rather than to stdout. For example:
  420. \c nasm -M -MF myfile.dep myfile.asm
  421. \S{opt-MD} The \i\c{-MD} Option: Assemble and Generate Dependencies
  422. The \c{-MD} option acts as the combination of the \c{-M} and \c{-MF}
  423. options (i.e. a filename has to be specified). However, unlike the
  424. \c{-M} or \c{-MG} options, \c{-MD} does \e{not} inhibit the normal
  425. operation of the assembler. Use this to automatically generate
  426. updated dependencies with every assembly session. For example:
  427. \c nasm -f elf -o myfile.o -MD myfile.dep myfile.asm
  428. If the argument after \c{-MD} is an option rather than a filename,
  429. then the output filename is the first applicable one of:
  430. \b the filename set in the \c{-MF} option;
  431. \b the output filename from the \c{-o} option with \c{.d} appended;
  432. \b the input filename with the extension set to \c{.d}.
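For instance, going by the fallback rules above, and assuming no
\c{-MF} option is given, the command

\c nasm -f elf -MD -o myfile.o myfile.asm

would be expected to assemble \c{myfile.asm} and write the
dependencies to \c{myfile.o.d}, because the argument following
\c{-MD} is an option and the \c{-o} filename is the first applicable
choice.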
  433. \S{opt-MT} The \i\c{-MT} Option: Dependency Target Name
  434. The \c{-MT} option can be used to override the default name of the
  435. dependency target. This is normally the same as the output filename,
  436. specified by the \c{-o} option.
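As an illustration (the \c{build/} directory is a hypothetical name),
the following should generate dependencies for a target named
\c{build/myfile.o} rather than the default:

\c nasm -M -MT build/myfile.o myfile.asm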
  437. \S{opt-MQ} The \i\c{-MQ} Option: Dependency Target Name (Quoted)
  438. The \c{-MQ} option acts as the \c{-MT} option, except it tries to
  439. quote characters that have special meaning in Makefile syntax. This
  440. is not foolproof, as not all characters with special meaning are
  441. quotable in Make. The default output (if no \c{-MT} or \c{-MQ} option
  442. is specified) is automatically quoted.
  443. \S{opt-MP} The \i\c{-MP} Option: Emit phony targets
  444. When used with any of the dependency generation options, the \c{-MP}
  445. option causes NASM to emit a phony target without dependencies for
  446. each header file. This prevents Make from complaining if a header
  447. file has been removed.
  448. \S{opt-MW} The \i\c{-MW} Option: Watcom Make quoting style
  449. This option causes NASM to attempt to quote dependencies according to
  450. Watcom Make conventions rather than POSIX Make conventions (also used
  451. by most other Make variants). This quotes \c{#} as \c{$#} rather than
  452. \c{\\#}, uses \c{&} rather than \c{\\} for continuation lines, and
  453. encloses filenames containing whitespace in double quotes.
  454. \S{opt-F} The \i\c{-F} Option: Selecting a \i{Debug Information Format}
  455. This option is used to select the format of the debug information
  456. emitted into the output file, to be used by a debugger (or \e{will}
  457. be). Prior to version 2.03.01, the use of this switch did \e{not} enable
  458. output of the selected debug info format. Use \c{-g}, see \k{opt-g},
  459. to enable output. Versions 2.03.01 and later automatically enable \c{-g}
  460. if \c{-F} is specified.
  461. A complete list of the available debug file formats for an output
  462. format can be seen by issuing the command \c{nasm -f <format> -y}. Not
  463. all output formats currently support debugging output. See \k{opt-y}.
  464. This should not be confused with the \c{-f dbg} output format option,
  465. see \k{dbgfmt}.
  466. \S{opt-g} The \i\c{-g} Option: Enabling \i{Debug Information}
  467. This option can be used to generate debugging information in the specified
  468. format. See \k{opt-F}. Using \c{-g} without \c{-F} results in emitting
  469. debug info in the default format, if any, for the selected output format.
  470. If no debug information is currently implemented in the selected output
  471. format, \c{-g} is \e{silently ignored}.
  472. \S{opt-X} The \i\c{-X} Option: Selecting an \i{Error Reporting Format}
  473. This option can be used to select an error reporting format for any
  474. error messages that might be produced by NASM.
  475. Currently, two error reporting formats may be selected. They are
  476. the \c{-Xvc} option and the \c{-Xgnu} option. The GNU format is
  477. the default and looks like this:
  478. \c filename.asm:65: error: specific error message
  479. where \c{filename.asm} is the name of the source file in which the
  480. error was detected, \c{65} is the source file line number on which
  481. the error was detected, \c{error} is the severity of the error (this
  482. could be \c{warning}), and \c{specific error message} is a more
  483. detailed text message which should help pinpoint the exact problem.
  484. The other format, specified by \c{-Xvc} is the style used by Microsoft
  485. Visual C++ and some other programs. It looks like this:
  486. \c filename.asm(65) : error: specific error message
  487. where the only difference is that the line number is in parentheses
  488. instead of being delimited by colons.
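For example, to request the Visual C++ message style when producing a
\c{win32} object file (the file name is only illustrative):

\c nasm -Xvc -f win32 myfile.asm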
  489. See also the \c{Visual C++} output format, \k{win32fmt}.
  490. \S{opt-Z} The \i\c{-Z} Option: Send Errors to a File
  491. Under \I{DOS}\c{MS-DOS} it can be difficult (though there are ways) to
  492. redirect the standard-error output of a program to a file. Since
  493. NASM usually produces its warning and \i{error messages} on
  494. \i\c{stderr}, this can make it hard to capture the errors if (for
  495. example) you want to load them into an editor.
  496. NASM therefore provides the \c{-Z} option, taking a filename argument
  497. which causes errors to be sent to the specified file rather than
  498. standard error. Therefore you can \I{redirecting errors}redirect
  499. the errors into a file by typing
  500. \c nasm -Z myfile.err -f obj myfile.asm
  501. In earlier versions of NASM, this option was called \c{-E}, but it was
  502. changed because \c{-E} is conventionally the option for preprocessing
  503. only, and confusing the two had disastrous results. See \k{opt-E}.
  504. \S{opt-s} The \i\c{-s} Option: Send Errors to \i\c{stdout}
  505. The \c{-s} option redirects \i{error messages} to \c{stdout} rather
  506. than \c{stderr}, so it can be redirected under \I{DOS}\c{MS-DOS}. To
  507. assemble the file \c{myfile.asm} and pipe its output to the \c{more}
  508. program, you can type:
  509. \c nasm -s -f obj myfile.asm | more
  510. See also the \c{-Z} option, \k{opt-Z}.
  511. \S{opt-i} The \i\c{-i}\I\c{-I} Option: Include File Search Directories
  512. When NASM sees the \i\c{%include} or \i\c{%pathsearch} directive in a
  513. source file (see \k{include}, \k{pathsearch} or \k{incbin}), it will
  514. search for the given file not only in the current directory, but also
  515. in any directories specified on the command line by the use of the
  516. \c{-i} option. Therefore you can include files from a \i{macro
  517. library}, for example, by typing
  518. \c nasm -ic:\macrolib\ -f obj myfile.asm
  519. (As usual, a space between \c{-i} and the path name is allowed, and
  520. optional).
  521. Prior to NASM 2.14, the path given to this option was prepended
  522. verbatim to the file name, so supplying the trailing path separator
  523. was up to the caller. This made it possible to concatenate a search
  524. path prefix with a filename, but that was more of a trick than a
  525. useful feature. A trailing path separator is now always added, so
  526. \c{-ifoo} is treated as the directory \c{-ifoo/}.
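As a sketch of the NASM 2.14 and later behaviour (the directory name
\c{include} and the file name \c{macros.inc} are hypothetical), the
command

\c nasm -iinclude -f elf myfile.asm

would let a line such as

\c %include "macros.inc"

in \c{myfile.asm} resolve to \c{include/macros.inc} when the file is
not found in the current directory.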
  527. If you want to define a \e{standard} \i{include search path},
  528. similar to \c{/usr/include} on Unix systems, you should place one or
  529. more \c{-i} directives in the \c{NASMENV} environment variable (see
  530. \k{nasmenv}).
  531. For Makefile compatibility with many C compilers, this option can also
  532. be specified as \c{-I}.
  533. \S{opt-p} The \i\c{-p}\I\c{-P} Option: \I{pre-including files}Pre-Include a File
  534. \I\c{%include}NASM allows you to specify files to be
  535. \e{pre-included} into your source file, by the use of the \c{-p}
  536. option. So running
  537. \c nasm myfile.asm -p myinc.inc
  538. is equivalent to running \c{nasm myfile.asm} and placing the
  539. directive \c{%include "myinc.inc"} at the start of the file.
  540. The \c{--include} option is also accepted.
  541. For consistency with the \c{-I}, \c{-D} and \c{-U} options, this
  542. option can also be specified as \c{-P}.
  543. \S{opt-d} The \i\c{-d}\I\c{-D} Option: \I{pre-defining macros}Pre-Define a Macro
  544. \I\c{%define}Just as the \c{-p} option gives an alternative to placing
  545. \c{%include} directives at the start of a source file, the \c{-d}
  546. option gives an alternative to placing a \c{%define} directive. You
  547. could code
  548. \c nasm myfile.asm -dFOO=100
  549. as an alternative to placing the directive
  550. \c %define FOO 100
  551. at the start of the file. You can also omit the macro value:
  552. the option \c{-dFOO} is equivalent to coding \c{%define FOO}. This
  553. form of the directive may be useful for selecting \i{assembly-time
  554. options} which are then tested using \c{%ifdef}, for example
  555. \c{-dDEBUG}.
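A minimal sketch of such a test in the source file (the routine
\c{dump_state} is a hypothetical name, not something NASM provides):

\c %ifdef DEBUG
\c         call    dump_state      ; assembled only when DEBUG is defined
\c %endif

Assembling with \c{-dDEBUG} includes the call; without it, the lines
between \c{%ifdef} and \c{%endif} are skipped.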
  556. For Makefile compatibility with many C compilers, this option can also
  557. be specified as \c{-D}.
  558. \S{opt-u} The \i\c{-u}\I\c{-U} Option: \I{Undefining macros}Undefine a Macro
  559. \I\c{%undef}The \c{-u} option undefines a macro that would otherwise
  560. have been pre-defined, either automatically or by a \c{-p} or \c{-d}
  561. option specified earlier on the command line.
  562. For example, the following command line:
  563. \c nasm myfile.asm -dFOO=100 -uFOO
  564. would result in \c{FOO} \e{not} being a predefined macro in the
  565. program. This is useful to override options specified at a different
  566. point in a Makefile.
  567. For Makefile compatibility with many C compilers, this option can also
  568. be specified as \c{-U}.
  569. \S{opt-E} The \i\c{-E}\I{-e} Option: Preprocess Only
  570. NASM allows the \i{preprocessor} to be run on its own, up to a
  571. point. Using the \c{-E} option (which requires no arguments) will
  572. cause NASM to preprocess its input file, expand all the macro
  573. references, remove all the comments and preprocessor directives, and
  574. print the resulting file on standard output (or save it to a file,
  575. if the \c{-o} option is also used).
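For example, taking \c{myfile.i} as an arbitrary name for the
preprocessed output:

\c nasm -E myfile.asm -o myfile.i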
  576. This option cannot be applied to programs which require the
  577. preprocessor to evaluate \I{preprocessor expressions}\i{expressions}
  578. which depend on the values of symbols: so code such as
  579. \c %assign tablesize ($-tablestart)
  580. will cause an error in \i{preprocess-only mode}.
For compatibility with older versions of NASM, this option can also be
  582. written \c{-e}. \c{-E} in older versions of NASM was the equivalent
  583. of the current \c{-Z} option, \k{opt-Z}.
  584. \S{opt-a} The \i\c{-a} Option: Don't Preprocess At All
  585. If NASM is being used as the back end to a compiler, it might be
  586. desirable to \I{suppressing preprocessing}suppress preprocessing
  587. completely and assume the compiler has already done it, to save time
  588. and increase compilation speeds. The \c{-a} option, requiring no
  589. argument, instructs NASM to replace its powerful \i{preprocessor}
  590. with a \i{stub preprocessor} which does nothing.
  591. \S{opt-O} The \i\c{-O} Option: Specifying \i{Multipass Optimization}
  592. Using the \c{-O} option, you can tell NASM to carry out different
  593. levels of optimization. Multiple flags can be specified after the
\c{-O} option, some of which can be combined in a single option,
  595. e.g. \c{-Oxv}.
  596. \b \c{-O0}: No optimization. All operands take their long forms,
  597. if a short form is not specified, except conditional jumps.
  598. This is intended to match NASM 0.98 behavior.
  599. \b \c{-O1}: Minimal optimization. As above, but immediate operands
  600. which will fit in a signed byte are optimized,
  601. unless the long form is specified. Conditional jumps default
  602. to the long form unless otherwise specified.
  603. \b \c{-Ox} (where \c{x} is the actual letter \c{x}): Multipass optimization.
  604. Minimize branch offsets and signed immediate bytes,
  605. overriding size specification unless the \c{strict} keyword
  606. has been used (see \k{strict}). For compatibility with earlier
  607. releases, the letter \c{x} may also be any number greater than
  608. one. This number has no effect on the actual number of passes.
  609. \b \c{-Ov}: At the end of assembly, print the number of passes
  610. actually executed.
  611. The \c{-Ox} mode is recommended for most uses, and is the default
  612. since NASM 2.09.
  613. Note that this is a capital \c{O}, and is different from a small \c{o}, which
  614. is used to specify the output file name. See \k{opt-o}.
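
To illustrate how \c{strict} interacts with the default \c{-Ox} mode, here is a small sketch for 16-bit code. The encodings in the comments are what the optimizer is expected to produce; see \k{strict} for the authoritative description.

\c         bits 16
\c         push dword 33           ; optimized: 66 6A 21
\c         push strict dword 33    ; full dword kept: 66 68 21 00 00 00

With \c{-O0}, both lines should assemble to the longer form, since operands then take their long forms unless a short form is specified.
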
  615. \S{opt-t} The \i\c{-t} Option: Enable TASM Compatibility Mode
  616. NASM includes a limited form of compatibility with Borland's \i\c{TASM}.
  617. When NASM's \c{-t} option is used, the following changes are made:
  618. \b local labels may be prefixed with \c{@@} instead of \c{.}
  619. \b size override is supported within brackets. In TASM compatible mode,
  620. a size override inside square brackets changes the size of the operand,
  621. and not the address type of the operand as it does in NASM syntax. E.g.
  622. \c{mov eax,[DWORD val]} is valid syntax in TASM compatibility mode.
  623. Note that you lose the ability to override the default address type for
  624. the instruction.
\b unprefixed forms of some directives are supported (\c{arg}, \c{elif},
  626. \c{else}, \c{endif}, \c{if}, \c{ifdef}, \c{ifdifi}, \c{ifndef},
  627. \c{include}, \c{local})
  628. \S{opt-w} The \i\c{-w} and \i\c{-W} Options: Enable or Disable Assembly \i{Warnings}
  629. NASM can observe many conditions during the course of assembly which
  630. are worth mentioning to the user, but not a sufficiently severe
  631. error to justify NASM refusing to generate an output file. These
  632. conditions are reported like errors, but come up with the word
  633. `warning' before the message. Warnings do not prevent NASM from
  634. generating an output file and returning a success status to the
  635. operating system.
  636. Some conditions are even less severe than that: they are only
  637. sometimes worth mentioning to the user. Therefore NASM supports the
  638. \c{-w} command-line option, which enables or disables certain
  639. classes of assembly warning. Such warning classes are described by a
  640. name, for example \c{orphan-labels}; you can enable warnings of
  641. this class by the command-line option \c{-w+orphan-labels} and
  642. disable it by \c{-w-orphan-labels}.
  643. The current \i{warning classes} are:
  644. \b \i\c{other} specifies any warning not otherwise specified in any
  645. class. Enabled by default.
  646. \b \i\c{macro-params} covers warnings about \i{multi-line macros}
  647. being invoked with the wrong number of parameters. Enabled by default;
  648. see \k{mlmacover} for an example of why you might want to disable it.
  649. \b \i\c{macro-selfref} warns if a macro references itself. Disabled by
  650. default.
  651. \b \i\c{macro-defaults} warns when a macro has more default parameters
  652. than optional parameters. Enabled by default; see \k{mlmacdef} for why
  653. you might want to disable it.
  654. \b \i\c{orphan-labels} covers warnings about source lines which
  655. contain no instruction but define a label without a trailing colon.
  656. NASM warns about this somewhat obscure condition by default;
  657. see \k{syntax} for more information.
  658. \b \i\c{number-overflow} covers warnings about numeric constants which
  659. don't fit in 64 bits. Enabled by default.
  660. \b \i\c{gnu-elf-extensions} warns if 8-bit or 16-bit relocations
  661. are used in \c{-f elf} format. The GNU extensions allow this.
  662. Disabled by default.
  663. \b \i\c{float-overflow} warns about floating point overflow.
  664. Enabled by default.
  665. \b \i\c{float-denorm} warns about floating point denormals.
  666. Disabled by default.
  667. \b \i\c{float-underflow} warns about floating point underflow.
  668. Disabled by default.
  669. \b \i\c{float-toolong} warns about too many digits in floating-point numbers.
  670. Enabled by default.
  671. \b \i\c{user} controls \c{%warning} directives (see \k{pperror}).
  672. Enabled by default.
  673. \b \i\c{lock} warns about \c{LOCK} prefixes on unlockable instructions.
  674. Enabled by default.
  675. \b \i\c{hle} warns about invalid use of the HLE \c{XACQUIRE} or \c{XRELEASE}
  676. prefixes.
  677. Enabled by default.
\b \i\c{bnd} warns about ineffective use of the \c{BND} prefix when a relaxed
form of the \c{JMP} instruction becomes the short form.
  680. Enabled by default.
  681. \b \i\c{zext-reloc} warns that a relocation has been zero-extended due
  682. to limitations in the output format. Enabled by default.
\b \i\c{ptr} warns about keywords used in other assemblers that might
  684. indicate a mistake in the source code. Currently only the MASM
  685. \c{PTR} keyword is recognized. Enabled by default.
  686. \b \i\c{bad-pragma} warns about a malformed or otherwise unparsable
  687. \c{%pragma} directive. Disabled by default.
  688. \b \i\c{unknown-pragma} warns about an unknown \c{%pragma} directive.
  689. This is not yet implemented. Disabled by default.
  690. \b \i\c{not-my-pragma} warns about a \c{%pragma} directive which is
  691. not applicable to this particular assembly session. This is not yet
  692. implemented. Disabled by default.
  693. \b \i\c{unknown-warning} warns about a \c{-w} or \c{-W} option or a
  694. \c{[WARNING]} directive that contains an unknown warning name or is
  695. otherwise not possible to process. Disabled by default.
  696. \b \i\c{all} is an alias for \e{all} suppressible warning classes.
  697. Thus, \c{-w+all} enables all available warnings, and \c{-w-all}
  698. disables warnings entirely (since NASM 2.13).
  699. Since version 2.00, NASM has also supported the \c{gcc}-like syntax
  700. \c{-Wwarning-class} and \c{-Wno-warning-class} instead of
  701. \c{-w+warning-class} and \c{-w-warning-class}, respectively; both
  702. syntaxes work identically.
  703. The option \c{-w+error} or \i\c{-Werror} can be used to treat warnings
  704. as errors. This can be controlled on a per warning class basis
  705. (\c{-w+error=}\e{warning-class} or \c{-Werror=}\e{warning-class});
  706. if no \e{warning-class} is specified NASM treats it as
  707. \c{-w+error=all}; the same applies to \c{-w-error} or
  708. \i\c{-Wno-error},
  709. of course.
  710. In addition, you can control warnings in the source code itself, using
  711. the \i\c{[WARNING]} directive. See \k{asmdir-warning}.
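
The following command lines sketch how these forms combine. Here \c{myfile.asm} is a placeholder file name, and the warning classes are chosen only as examples:

\c nasm -f elf myfile.asm -w+macro-selfref -w-number-overflow
\c nasm -f elf myfile.asm -Wmacro-selfref -Wno-number-overflow
\c nasm -f elf myfile.asm -Werror=orphan-labels

The first two command lines are equivalent. The third asks NASM to treat warnings of the \c{orphan-labels} class as errors.
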
  712. \S{opt-v} The \i\c{-v} Option: Display \i{Version} Info
  713. Typing \c{NASM -v} will display the version of NASM which you are using,
  714. and the date on which it was compiled.
  715. You will need the version number if you report a bug.
  716. For command-line compatibility with Yasm, the form \i\c{--v} is also
  717. accepted for this option starting in NASM version 2.11.05.
  718. \S{opt-y} The \i\c{-y} Option: Display Available Debug Info Formats
  719. Typing \c{nasm -f <option> -y} will display a list of the available
  720. debug info formats for the given output format. The default format
  721. is indicated by an asterisk. For example:
  722. \c nasm -f elf -y
  723. \c valid debug formats for 'elf32' output format are
  724. \c ('*' denotes default):
  725. \c * stabs ELF32 (i386) stabs debug format for Linux
  726. \c dwarf elf32 (i386) dwarf debug format for Linux
  727. \S{opt-pfix} The \i\c{--(g|l)prefix}, \i\c{--(g|l)postfix} Options.
  728. The \c{--(g)prefix} options prepend the given argument
  729. to all \c{extern}, \c{common}, \c{static}, and \c{global} symbols, and the
\c{--lprefix} option prepends to all other symbols. Similarly, the
\c{--(g)postfix} and \c{--lpostfix} options append
the argument in exactly the same way as the \c{--xxprefix} options do.
  733. Running this:
  734. \c nasm -f macho --gprefix _
is equivalent to placing the directive \c{%pragma macho gprefix _}
  736. at the start of the file (\k{mangling}). It will prepend the underscore
  737. to all global and external variables, as C requires it in some, but not all,
  738. system calling conventions.
  739. \S{opt-pragma} The \i\c{--pragma} Option
The argument to this option is treated as a \c{%pragma} directive, as if
a \c{%pragma} preprocessor statement had been placed at the beginning of the source.
  742. Running this:
  743. \c nasm -f macho --pragma "macho gprefix _"
  744. is equivalent to the example in \k{opt-pfix}.
  745. \S{opt-before} The \i\c{--before} Option
A preprocessor statement can be passed with this option. The example
  747. shown in \k{opt-pragma} is the same as running this:
  748. \c nasm -f macho --before "%pragma macho gprefix _"
  749. \S{opt-limit} The \i\c{--limit-X} Option
This option allows the user to set various maximum values:
\b\c{--limit-passes}: Maximum number of allowed passes. Default is
effectively unlimited.
\b\c{--limit-stalled-passes}: Maximum number of allowed unfinished
passes. Default is 1000.
\b\c{--limit-macro-levels}: Maximum depth of macro expansion
(in the preprocessor). Default is 1000000.
\b\c{--limit-rep}: Maximum repeat count allowed for a preprocessor loop
defined with \c{%rep}. Default is 1000000.
\b\c{--limit-eval}: Upper bound on the allowed expression length.
Default is 1000000.
\b\c{--limit-lines}: Total number of source lines allowed to be
processed. Default is 2000000000.
For example, running this limits the maximum line count to 1000:
  764. \c nasm --limit-lines 1000
  765. \S{opt-keep-all} The \i\c{--keep-all} Option
  766. This option prevents NASM from deleting any output files even if an
  767. error happens.
  768. \S{opt-no-line} The \i\c{--no-line} Option
  769. If this option is given, all \i\c{%line} directives in the source code
  770. are ignored. This can be useful for debugging already preprocessed
  771. code. See \k{line}.
  772. \S{nasmenv} The \i\c{NASMENV} \i{Environment} Variable
  773. If you define an environment variable called \c{NASMENV}, the program
  774. will interpret it as a list of extra command-line options, which are
  775. processed before the real command line. You can use this to define
  776. standard search directories for include files, by putting \c{-i}
  777. options in the \c{NASMENV} variable.
  778. The value of the variable is split up at white space, so that the
  779. value \c{-s -ic:\\nasmlib\\} will be treated as two separate options.
  780. However, that means that the value \c{-dNAME="my name"} won't do
  781. what you might want, because it will be split at the space and the
  782. NASM command-line processing will get confused by the two
  783. nonsensical words \c{-dNAME="my} and \c{name"}.
  784. To get round this, NASM provides a feature whereby, if you begin the
  785. \c{NASMENV} environment variable with some character that isn't a minus
  786. sign, then NASM will treat this character as the \i{separator
  787. character} for options. So setting the \c{NASMENV} variable to the
  788. value \c{!-s!-ic:\\nasmlib\\} is equivalent to setting it to \c{-s
  789. -ic:\\nasmlib\\}, but \c{!-dNAME="my name"} will work.
  790. This environment variable was previously called \c{NASM}. This was
  791. changed with version 0.98.31.
  792. \H{qstart} \i{Quick Start} for \i{MASM} Users
  793. If you're used to writing programs with MASM, or with \i{TASM} in
  794. MASM-compatible (non-Ideal) mode, or with \i\c{a86}, this section
  795. attempts to outline the major differences between MASM's syntax and
  796. NASM's. If you're not already used to MASM, it's probably worth
  797. skipping this section.
  798. \S{qscs} NASM Is \I{case sensitivity}Case-Sensitive
  799. One simple difference is that NASM is case-sensitive. It makes a
  800. difference whether you call your label \c{foo}, \c{Foo} or \c{FOO}.
  801. If you're assembling to \c{DOS} or \c{OS/2} \c{.OBJ} files, you can
  802. invoke the \i\c{UPPERCASE} directive (documented in \k{objfmt}) to
  803. ensure that all symbols exported to other code modules are forced
  804. to be upper case; but even then, \e{within} a single module, NASM
  805. will distinguish between labels differing only in case.
  806. \S{qsbrackets} NASM Requires \i{Square Brackets} For \i{Memory References}
  807. NASM was designed with simplicity of syntax in mind. One of the
  808. \i{design goals} of NASM is that it should be possible, as far as is
  809. practical, for the user to look at a single line of NASM code
  810. and tell what opcode is generated by it. You can't do this in MASM:
  811. if you declare, for example,
  812. \c foo equ 1
  813. \c bar dw 2
  814. then the two lines of code
  815. \c mov ax,foo
  816. \c mov ax,bar
  817. generate completely different opcodes, despite having
  818. identical-looking syntaxes.
  819. NASM avoids this undesirable situation by having a much simpler
  820. syntax for memory references. The rule is simply that any access to
  821. the \e{contents} of a memory location requires square brackets
  822. around the address, and any access to the \e{address} of a variable
  823. doesn't. So an instruction of the form \c{mov ax,foo} will
  824. \e{always} refer to a compile-time constant, whether it's an \c{EQU}
  825. or the address of a variable; and to access the \e{contents} of the
  826. variable \c{bar}, you must code \c{mov ax,[bar]}.
  827. This also means that NASM has no need for MASM's \i\c{OFFSET}
  828. keyword, since the MASM code \c{mov ax,offset bar} means exactly the
  829. same thing as NASM's \c{mov ax,bar}. If you're trying to get
  830. large amounts of MASM code to assemble sensibly under NASM, you
  831. can always code \c{%idefine offset} to make the preprocessor treat
  832. the \c{OFFSET} keyword as a no-op.
  833. This issue is even more confusing in \i\c{a86}, where declaring a
  834. label with a trailing colon defines it to be a `label' as opposed to
  835. a `variable' and causes \c{a86} to adopt NASM-style semantics; so in
  836. \c{a86}, \c{mov ax,var} has different behaviour depending on whether
  837. \c{var} was declared as \c{var: dw 0} (a label) or \c{var dw 0} (a
  838. word-size variable). NASM is very simple by comparison:
  839. \e{everything} is a label.
  840. NASM, in the interests of simplicity, also does not support the
  841. \i{hybrid syntaxes} supported by MASM and its clones, such as
  842. \c{mov ax,table[bx]}, where a memory reference is denoted by one
  843. portion outside square brackets and another portion inside. The
  844. correct syntax for the above is \c{mov ax,[table+bx]}. Likewise,
  845. \c{mov ax,es:[di]} is wrong and \c{mov ax,[es:di]} is right.
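
The rules above can be summarised in a short 16-bit fragment. This is only a sketch, and the label \c{table} is introduced here purely for illustration:

\c         bits 16
\c         mov ax,foo          ; constant: loads the value 1
\c         mov ax,bar          ; constant: loads the address of bar
\c         mov ax,[bar]        ; memory: loads the contents of bar
\c         mov ax,[table+bx]   ; what MASM writes as mov ax,table[bx]
\c         mov ax,[es:di]      ; what MASM writes as mov ax,es:[di]
\c
\c foo     equ 1
\c bar     dw 2
\c table   dw 10,20,30
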
  846. \S{qstypes} NASM Doesn't Store \i{Variable Types}
  847. NASM, by design, chooses not to remember the types of variables you
  848. declare. Whereas MASM will remember, on seeing \c{var dw 0}, that
  849. you declared \c{var} as a word-size variable, and will then be able
  850. to fill in the \i{ambiguity} in the size of the instruction \c{mov
  851. var,2}, NASM will deliberately remember nothing about the symbol
  852. \c{var} except where it begins, and so you must explicitly code
  853. \c{mov word [var],2}.
  854. For this reason, NASM doesn't support the \c{LODS}, \c{MOVS},
  855. \c{STOS}, \c{SCAS}, \c{CMPS}, \c{INS}, or \c{OUTS} instructions,
  856. but only supports the forms such as \c{LODSB}, \c{MOVSW}, and
  857. \c{SCASD}, which explicitly specify the size of the components of
  858. the strings being manipulated.
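
A brief sketch of what this means in practice, assuming 16-bit code and a word-sized variable named \c{var} as in the text above:

\c         bits 16
\c         mov word [var],2    ; size must be stated: stores a word
\c         mov byte [var],2    ; same address, but stores a single byte
\c         mov [var],ax        ; no size needed: ax implies a word
\c         lodsb               ; string instructions name their own size
\c         movsw
\c
\c var     dw 0
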
  859. \S{qsassume} NASM Doesn't \i\c{ASSUME}
  860. As part of NASM's drive for simplicity, it also does not support the
  861. \c{ASSUME} directive. NASM will not keep track of what values you
  862. choose to put in your segment registers, and will never
  863. \e{automatically} generate a \i{segment override} prefix.
  864. \S{qsmodel} NASM Doesn't Support \i{Memory Models}
  865. NASM also does not have any directives to support different 16-bit
  866. memory models. The programmer has to keep track of which functions
  867. are supposed to be called with a \i{far call} and which with a
  868. \i{near call}, and is responsible for putting the correct form of
  869. \c{RET} instruction (\c{RETN} or \c{RETF}; NASM accepts \c{RET}
  870. itself as an alternate form for \c{RETN}); in addition, the
programmer is responsible for coding \c{CALL FAR} instructions where
  872. necessary when calling \e{external} functions, and must also keep
  873. track of which external variable definitions are far and which are
  874. near.
  875. \S{qsfpu} \i{Floating-Point} Differences
  876. NASM uses different names to refer to floating-point registers from
  877. MASM: where MASM would call them \c{ST(0)}, \c{ST(1)} and so on, and
  878. \i\c{a86} would call them simply \c{0}, \c{1} and so on, NASM
  879. chooses to call them \c{st0}, \c{st1} etc.
  880. As of version 0.96, NASM now treats the instructions with
  881. \i{`nowait'} forms in the same way as MASM-compatible assemblers.
  882. The idiosyncratic treatment employed by 0.95 and earlier was based
  883. on a misunderstanding by the authors.
  884. \S{qsother} Other Differences
  885. For historical reasons, NASM uses the keyword \i\c{TWORD} where MASM
  886. and compatible assemblers use \i\c{TBYTE}.
  887. NASM does not declare \i{uninitialized storage} in the same way as
  888. MASM: where a MASM programmer might use \c{stack db 64 dup (?)},
  889. NASM requires \c{stack resb 64}, intended to be read as `reserve 64
  890. bytes'. For a limited amount of compatibility, since NASM treats
  891. \c{?} as a valid character in symbol names, you can code \c{? equ 0}
  892. and then writing \c{dw ?} will at least do something vaguely useful.
  893. \I\c{RESB}\i\c{DUP} is still not a supported syntax, however.
  894. In addition to all of this, macros and directives work completely
  895. differently to MASM. See \k{preproc} and \k{directive} for further
  896. details.
  897. \C{lang} The NASM Language
  898. \H{syntax} Layout of a NASM Source Line
As in most assemblers, each NASM source line contains (unless it
  900. is a macro, a preprocessor directive or an assembler directive: see
  901. \k{preproc} and \k{directive}) some combination of the four fields
  902. \c label: instruction operands ; comment
  903. As usual, most of these fields are optional; the presence or absence
  904. of any combination of a label, an instruction and a comment is allowed.
  905. Of course, the operand field is either required or forbidden by the
  906. presence and nature of the instruction field.
  907. NASM uses backslash (\\) as the line continuation character; if a line
  908. ends with backslash, the next line is considered to be a part of the
  909. backslash-ended line.
  910. NASM places no restrictions on white space within a line: labels may
  911. have white space before them, or instructions may have no space
  912. before them, or anything. The \i{colon} after a label is also
  913. optional. (Note that this means that if you intend to code \c{lodsb}
  914. alone on a line, and type \c{lodab} by accident, then that's still a
  915. valid source line which does nothing but define a label. Running
  916. NASM with the command-line option
  917. \I{orphan-labels}\c{-w+orphan-labels} will cause it to warn you if
  918. you define a label alone on a line without a \i{trailing colon}.)
  919. \i{Valid characters} in labels are letters, numbers, \c{_}, \c{$},
  920. \c{#}, \c{@}, \c{~}, \c{.}, and \c{?}. The only characters which may
  921. be used as the \e{first} character of an identifier are letters,
  922. \c{.} (with special meaning: see \k{locallab}), \c{_} and \c{?}.
  923. An identifier may also be prefixed with a \I{$, prefix}\c{$} to
  924. indicate that it is intended to be read as an identifier and not a
  925. reserved word; thus, if some other module you are linking with
  926. defines a symbol called \c{eax}, you can refer to \c{$eax} in NASM
  927. code to distinguish the symbol from the register. Maximum length of
  928. an identifier is 4095 characters.
  929. The instruction field may contain any machine instruction: Pentium
  930. and P6 instructions, FPU instructions, MMX instructions and even
  931. undocumented instructions are all supported. The instruction may be
  932. prefixed by \c{LOCK}, \c{REP}, \c{REPE}/\c{REPZ}, \c{REPNE}/\c{REPNZ},
  933. \c{XACQUIRE}/\c{XRELEASE} or \c{BND}/\c{NOBND}, in the usual way. Explicit
  934. \I{address-size prefixes}address-size and \i{operand-size prefixes} \i\c{A16},
\i\c{A32}, \i\c{A64}, \i\c{O16}, \i\c{O32} and \i\c{O64} are provided; one example of their use
  936. is given in \k{mixsize}. You can also use the name of a \I{segment
  937. override}segment register as an instruction prefix: coding
  938. \c{es mov [bx],ax} is equivalent to coding \c{mov [es:bx],ax}. We
  939. recommend the latter syntax, since it is consistent with other
  940. syntactic features of the language, but for instructions such as
  941. \c{LODSB}, which has no operands and yet can require a segment
  942. override, there is no clean syntactic way to proceed apart from
  943. \c{es lodsb}.
  944. An instruction is not required to use a prefix: prefixes such as
  945. \c{CS}, \c{A32}, \c{LOCK} or \c{REPE} can appear on a line by
  946. themselves, and NASM will just generate the prefix bytes.
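
The prefix forms described above might be used as in this 16-bit sketch, which is illustrative rather than a complete program:

\c         bits 16
\c         mov [es:bx],ax      ; recommended form of a segment override
\c         es mov [bx],ax      ; same instruction, override given as a prefix
\c         es lodsb            ; LODSB has no operands, so a prefix is needed
\c         rep movsb           ; repeat prefix used in the usual way
\c         cs                  ; a prefix alone on a line: only the prefix byte
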
  947. In addition to actual machine instructions, NASM also supports a
  948. number of pseudo-instructions, described in \k{pseudop}.
  949. Instruction \i{operands} may take a number of forms: they can be
  950. registers, described simply by the register name (e.g. \c{ax},
  951. \c{bp}, \c{ebx}, \c{cr0}: NASM does not use the \c{gas}-style
  952. syntax in which register names must be prefixed by a \c{%} sign), or
  953. they can be \i{effective addresses} (see \k{effaddr}), constants
  954. (\k{const}) or expressions (\k{expr}).
  955. For x87 \i{floating-point} instructions, NASM accepts a wide range of
  956. syntaxes: you can use two-operand forms like MASM supports, or you
  957. can use NASM's native single-operand forms in most cases.
  958. \# Details of
  959. \# all forms of each supported instruction are given in
  960. \# \k{iref}.
  961. For example, you can code:
  962. \c fadd st1 ; this sets st0 := st0 + st1
  963. \c fadd st0,st1 ; so does this
  964. \c
  965. \c fadd st1,st0 ; this sets st1 := st1 + st0
  966. \c fadd to st1 ; so does this
  967. Almost any x87 floating-point instruction that references memory must
  968. use one of the prefixes \i\c{DWORD}, \i\c{QWORD} or \i\c{TWORD} to
  969. indicate what size of \i{memory operand} it refers to.
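
For instance, a minimal sketch with three memory operands of different sizes (the label names \c{f32val}, \c{f64val} and \c{f80val} are made up for this example):

\c         fld dword [f32val]      ; load a 32-bit float into st0
\c         fadd qword [f64val]     ; add a 64-bit float to st0
\c         fstp tword [f80val]     ; store st0 as an 80-bit float and pop
\c
\c f32val: dd 1.5
\c f64val: dq 2.25
\c f80val: dt 0.0
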
  970. \H{pseudop} \i{Pseudo-Instructions}
  971. Pseudo-instructions are things which, though not real x86 machine
  972. instructions, are used in the instruction field anyway because that's
  973. the most convenient place to put them. The current pseudo-instructions
  974. are \i\c{DB}, \i\c{DW}, \i\c{DD}, \i\c{DQ}, \i\c{DT}, \i\c{DO},
\i\c{DY} and \i\c{DZ}; their \i{uninitialized} counterparts
\i\c{RESB}, \i\c{RESW}, \i\c{RESD}, \i\c{RESQ}, \i\c{REST},
\i\c{RESO}, \i\c{RESY} and \i\c{RESZ}; the \i\c{INCBIN} command, the
  978. \i\c{EQU} command, and the \i\c{TIMES} prefix.
  979. \S{db} \c{DB} and Friends: Declaring Initialized Data
  980. \i\c{DB}, \i\c{DW}, \i\c{DD}, \i\c{DQ}, \i\c{DT}, \i\c{DO}, \i\c{DY}
  981. and \i\c{DZ} are used, much as in MASM, to declare initialized data in
  982. the output file. They can be invoked in a wide range of ways:
  983. \I{floating-point}\I{character constant}\I{string constant}
  984. \c db 0x55 ; just the byte 0x55
  985. \c db 0x55,0x56,0x57 ; three bytes in succession
  986. \c db 'a',0x55 ; character constants are OK
  987. \c db 'hello',13,10,'$' ; so are string constants
  988. \c dw 0x1234 ; 0x34 0x12
  989. \c dw 'a' ; 0x61 0x00 (it's just a number)
  990. \c dw 'ab' ; 0x61 0x62 (character constant)
  991. \c dw 'abc' ; 0x61 0x62 0x63 0x00 (string)
  992. \c dd 0x12345678 ; 0x78 0x56 0x34 0x12
  993. \c dd 1.234567e20 ; floating-point constant
  994. \c dq 0x123456789abcdef0 ; eight byte constant
  995. \c dq 1.234567e20 ; double-precision float
  996. \c dt 1.234567e20 ; extended-precision float
  997. \c{DT}, \c{DO}, \c{DY} and \c{DZ} do not accept \i{numeric constants}
  998. as operands.
  999. \S{resb} \c{RESB} and Friends: Declaring \i{Uninitialized} Data
  1000. \i\c{RESB}, \i\c{RESW}, \i\c{RESD}, \i\c{RESQ}, \i\c{REST},
\i\c{RESO}, \i\c{RESY} and \i\c{RESZ} are designed to be used in the
  1002. BSS section of a module: they declare \e{uninitialized} storage
  1003. space. Each takes a single operand, which is the number of bytes,
  1004. words, doublewords or whatever to reserve. As stated in \k{qsother},
  1005. NASM does not support the MASM/TASM syntax of reserving uninitialized
  1006. space by writing \I\c{?}\c{DW ?} or similar things: this is what it
  1007. does instead. The operand to a \c{RESB}-type pseudo-instruction is a
  1008. \i\e{critical expression}: see \k{crit}.
  1009. For example:
  1010. \c buffer: resb 64 ; reserve 64 bytes
  1011. \c wordvar: resw 1 ; reserve a word
  1012. \c realarray resq 10 ; array of ten reals
  1013. \c ymmval: resy 1 ; one YMM register
  1014. \c zmmvals: resz 32 ; 32 ZMM registers
  1015. \S{incbin} \i\c{INCBIN}: Including External \i{Binary Files}
  1016. \c{INCBIN} is borrowed from the old Amiga assembler \i{DevPac}: it
  1017. includes a binary file verbatim into the output file. This can be
  1018. handy for (for example) including \i{graphics} and \i{sound} data
  1019. directly into a game executable file. It can be called in one of
  1020. these three ways:
  1021. \c incbin "file.dat" ; include the whole file
  1022. \c incbin "file.dat",1024 ; skip the first 1024 bytes
  1023. \c incbin "file.dat",1024,512 ; skip the first 1024, and
  1024. \c ; actually include at most 512
  1025. \c{INCBIN} is both a directive and a standard macro; the standard
  1026. macro version searches for the file in the include file search path
  1027. and adds the file to the dependency lists. This macro can be
  1028. overridden if desired.
  1029. \S{equ} \i\c{EQU}: Defining Constants
  1030. \c{EQU} defines a symbol to a given constant value: when \c{EQU} is
  1031. used, the source line must contain a label. The action of \c{EQU} is
  1032. to define the given label name to the value of its (only) operand.
  1033. This definition is absolute, and cannot change later. So, for
  1034. example,
  1035. \c message db 'hello, world'
  1036. \c msglen equ $-message
  1037. defines \c{msglen} to be the constant 12. \c{msglen} may not then be
  1038. redefined later. This is not a \i{preprocessor} definition either:
  1039. the value of \c{msglen} is evaluated \e{once}, using the value of
  1040. \c{$} (see \k{expr} for an explanation of \c{$}) at the point of
  1041. definition, rather than being evaluated wherever it is referenced
  1042. and using the value of \c{$} at the point of reference.
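
The difference from a preprocessor definition can be seen in the following sketch, where \c{curlen} is a hypothetical single-line macro added for comparison and the fragment is assumed to start at offset zero of its section:

\c message db 'hello, world'       ; 12 bytes
\c msglen  equ $-message           ; evaluated once, here: 12
\c %define curlen ($-message)      ; text, re-evaluated at each use
\c         db 0,0,0,0              ; four more bytes
\c         dw msglen               ; assembles the value 12
\c         dw curlen               ; assembles 18, since $ has moved on
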
  1043. \S{times} \i\c{TIMES}: \i{Repeating} Instructions or Data
  1044. The \c{TIMES} prefix causes the instruction to be assembled multiple
  1045. times. This is partly present as NASM's equivalent of the \i\c{DUP}
  1046. syntax supported by \i{MASM}-compatible assemblers, in that you can
  1047. code
  1048. \c zerobuf: times 64 db 0
  1049. or similar things; but \c{TIMES} is more versatile than that. The
  1050. argument to \c{TIMES} is not just a numeric constant, but a numeric
  1051. \e{expression}, so you can do things like
  1052. \c buffer: db 'hello, world'
  1053. \c times 64-$+buffer db ' '
  1054. which will store exactly enough spaces to make the total length of
  1055. \c{buffer} up to 64. Finally, \c{TIMES} can be applied to ordinary
  1056. instructions, so you can code trivial \i{unrolled loops} in it:
  1057. \c times 100 movsb
  1058. Note that there is no effective difference between \c{times 100 resb
  1059. 1} and \c{resb 100}, except that the latter will be assembled about
  1060. 100 times faster due to the internal structure of the assembler.
  1061. The operand to \c{TIMES} is a critical expression (\k{crit}).
  1062. Note also that \c{TIMES} can't be applied to \i{macros}: the reason
  1063. for this is that \c{TIMES} is processed after the macro phase, which
  1064. allows the argument to \c{TIMES} to contain expressions such as
  1065. \c{64-$+buffer} as above. To repeat more than one line of code, or a
  1066. complex macro, use the preprocessor \i\c{%rep} directive.
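
As a small sketch of the \c{%rep} alternative, the following builds a table of squares, which needs two lines per repetition and so cannot be written with \c{TIMES}. The counter name \c{i} is arbitrary:

\c %assign i 0
\c %rep 8
\c         db i*i              ; emits 0,1,4,9,16,25,36,49 in turn
\c %assign i i+1
\c %endrep
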
  1067. \H{effaddr} Effective Addresses
  1068. An \i{effective address} is any operand to an instruction which
  1069. \I{memory reference}references memory. Effective addresses, in NASM,
  1070. have a very simple syntax: they consist of an expression evaluating
  1071. to the desired address, enclosed in \i{square brackets}. For
  1072. example:
  1073. \c wordvar dw 123
  1074. \c mov ax,[wordvar]
  1075. \c mov ax,[wordvar+1]
  1076. \c mov ax,[es:wordvar+bx]
  1077. Anything not conforming to this simple system is not a valid memory
  1078. reference in NASM, for example \c{es:wordvar[bx]}.
  1079. More complicated effective addresses, such as those involving more
  1080. than one register, work in exactly the same way:
  1081. \c mov eax,[ebx*2+ecx+offset]
  1082. \c mov ax,[bp+di+8]
  1083. NASM is capable of doing \i{algebra} on these effective addresses,
  1084. so that things which don't necessarily \e{look} legal are perfectly
  1085. all right:
  1086. \c mov eax,[ebx*5] ; assembles as [ebx*4+ebx]
  1087. \c mov eax,[label1*2-label2] ; ie [label1+(label1-label2)]
  1088. Some forms of effective address have more than one assembled form;
  1089. in most such cases NASM will generate the smallest form it can. For
  1090. example, there are distinct assembled forms for the 32-bit effective
  1091. addresses \c{[eax*2+0]} and \c{[eax+eax]}, and NASM will generally
  1092. generate the latter on the grounds that the former requires four
  1093. bytes to store a zero offset.
  1094. NASM has a hinting mechanism which will cause \c{[eax+ebx]} and
  1095. \c{[ebx+eax]} to generate different opcodes; this is occasionally
  1096. useful because \c{[esi+ebp]} and \c{[ebp+esi]} have different
  1097. default segment registers.
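
As a minimal sketch of this hinting (32-bit code is assumed, and the default segments in the comments follow from the x86 rule that a base register of \c{EBP} selects \c{SS}):

\c         bits 32
\c         mov eax,[esi+ebp]   ; esi is taken as the base: default segment DS
\c         mov eax,[ebp+esi]   ; ebp is taken as the base: default segment SS
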
  1098. However, you can force NASM to generate an effective address in a
  1099. particular form by the use of the keywords \c{BYTE}, \c{WORD},
  1100. \c{DWORD} and \c{NOSPLIT}. If you need \c{[eax+3]} to be assembled
  1101. using a double-word offset field instead of the one byte NASM will
  1102. normally generate, you can code \c{[dword eax+3]}. Similarly, you
  1103. can force NASM to use a byte offset for a small value which it
  1104. hasn't seen on the first pass (see \k{crit} for an example of such a
  1105. code fragment) by using \c{[byte eax+offset]}. As special cases,
  1106. \c{[byte eax]} will code \c{[eax+0]} with a byte offset of zero, and
  1107. \c{[dword eax]} will code it with a double-word offset of zero. The
  1108. normal form, \c{[eax]}, will be coded with no offset field.
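
The forms above can be compared side by side. This is a sketch assuming 32-bit code; the byte sequences in the comments are the encodings one would expect from the x86 ModRM rules, and are not quoted from NASM output:

\c         bits 32
\c         mov ebx,[eax]           ; 8B 18              no offset field
\c         mov ebx,[byte eax]      ; 8B 58 00           byte offset of zero
\c         mov ebx,[dword eax]     ; 8B 98 00 00 00 00  dword offset of zero
\c         mov ebx,[eax+3]         ; 8B 58 03           byte offset
\c         mov ebx,[dword eax+3]   ; 8B 98 03 00 00 00  dword offset
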
  1109. The form described in the previous paragraph is also useful if you
  1110. are trying to access data in a 32-bit segment from within 16-bit code.
  1111. For more information on this see the section on mixed-size addressing
  1112. (\k{mixaddr}). In particular, if you need to access data at a known
  1113. offset that is too large to fit in a 16-bit value, you must specify
  1114. that it is a dword offset; otherwise NASM will discard the high word
  1115. of the offset.
  1116. Similarly, NASM will split \c{[eax*2]} into \c{[eax+eax]} because
  1117. that allows the offset field to be absent and space to be saved; in
  1118. fact, it will also split \c{[eax*2+offset]} into
  1119. \c{[eax+eax+offset]}. You can combat this behaviour by the use of
  1120. the \c{NOSPLIT} keyword: \c{[nosplit eax*2]} will force
  1121. \c{[eax*2+0]} to be generated literally. \c{[nosplit eax*1]} also has the
  1122. same effect. Alternatively, the split EA form \c{[0, eax*2]} can be used.
  1123. However, \c{NOSPLIT} in \c{[nosplit eax+eax]} is ignored, because the user's
  1124. intention there is taken to be \c{[eax+eax]}.
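
A short sketch of the three spellings discussed in this paragraph, assuming 32-bit code (the destination register is arbitrary):

\c         bits 32
\c         mov ebx,[eax*2]           ; split into [eax+eax], no offset field
\c         mov ebx,[nosplit eax*2]   ; kept as [eax*2+0], with a zero offset
\c         mov ebx,[0,eax*2]         ; split EA form: eax stays the index
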
  1125. In 64-bit mode, NASM will by default generate absolute addresses. The
  1126. \i\c{REL} keyword makes it produce \c{RIP}-relative addresses. Since
  1127. this is usually the desired behaviour, it can be made the default with
  1128. the \c{DEFAULT} directive (\k{default}). The keyword \i\c{ABS} overrides \i\c{REL}.
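
The interaction of these keywords might look as follows. This is only a sketch: the variable \c{counter} is hypothetical, and whether an absolute reference can be linked depends on the output format in use.

\c         bits 64
\c         default rel             ; RIP-relative unless overridden
\c
\c         section .data
\c counter: dq 0                   ; hypothetical variable
\c
\c         section .text
\c         mov rax,[counter]       ; RIP-relative, because of DEFAULT REL
\c         mov rax,[abs counter]   ; absolute address, overriding the default
\c         mov rax,[rel counter]   ; explicitly RIP-relative
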
  1129. A new form of split effective address syntax is also supported. This is
  1130. mainly intended for mib operands as used by MPX instructions, but can
  1131. be used for any memory reference. The basic concept of this form is
  1132. splitting base and index.
  1133. \c mov eax,[ebx+8,ecx*4] ; ebx=base, ecx=index, 4=scale, 8=disp
  1134. For mib operands, there are several ways of writing the effective address, depending
  1135. on the tool. NASM supports all currently known forms of mib syntax:
  1136. \c ; bndstx
  1137. \c ; next 5 lines are parsed the same
  1138. \c ; base=rax, index=rbx, scale=1, displacement=3
  1139. \c bndstx [rax+0x3,rbx], bnd0 ; NASM - split EA
  1140. \c bndstx [rbx*1+rax+0x3], bnd0 ; GAS - '*1' indicates an index reg
  1141. \c bndstx [rax+rbx+3], bnd0 ; GAS - without hints
  1142. \c bndstx [rax+0x3], bnd0, rbx ; ICC-1
  1143. \c bndstx [rax+0x3], rbx, bnd0 ; ICC-2
  1144. When a broadcasting decorator is used, the opsize keyword should match
  1145. the size of each element.
  1146. \c VDIVPS zmm4, zmm5, dword [rbx]{1to16} ; single-precision float
  1147. \c VDIVPS zmm4, zmm5, zword [rbx] ; packed 512 bit memory
  1148. \H{const} \i{Constants}
  1149. NASM understands four different types of constant: numeric,
  1150. character, string and floating-point.
  1151. \S{numconst} \i{Numeric Constants}
  1152. A numeric constant is simply a number. NASM allows you to specify
  1153. numbers in a variety of number bases, in a variety of ways: you can
  1154. suffix \c{H} or \c{X}, \c{D} or \c{T}, \c{Q} or \c{O}, and \c{B} or
  1155. \c{Y} for \i{hexadecimal}, \i{decimal}, \i{octal} and \i{binary}
  1156. respectively, or you can prefix \c{0x}, for hexadecimal in the style
  1157. of C, or you can prefix \c{$} for hexadecimal in the style of Borland
  1158. Pascal or Motorola Assemblers. Note, though, that the \I{$,
  1159. prefix}\c{$} prefix does double duty as a prefix on identifiers (see
  1160. \k{syntax}), so a hex number prefixed with a \c{$} sign must have a
  1161. digit after the \c{$} rather than a letter. In addition, current
  1162. versions of NASM accept the prefix \c{0h} for hexadecimal, \c{0d} or
  1163. \c{0t} for decimal, \c{0o} or \c{0q} for octal, and \c{0b} or \c{0y}
  1164. for binary. Please note that unlike C, a \c{0} prefix by itself does
  1165. \e{not} imply an octal constant!
  1166. Numeric constants can have underscores (\c{_}) interspersed to break
  1167. up long strings.
  1168. Some examples (all producing exactly the same code):
  1169. \c mov ax,200 ; decimal
  1170. \c mov ax,0200 ; still decimal
  1171. \c mov ax,0200d ; explicitly decimal
  1172. \c mov ax,0d200 ; also decimal
  1173. \c mov ax,0c8h ; hex
  1174. \c mov ax,$0c8 ; hex again: the 0 is required
  1175. \c mov ax,0xc8 ; hex yet again
  1176. \c mov ax,0hc8 ; still hex
  1177. \c mov ax,310q ; octal
  1178. \c mov ax,310o ; octal again
  1179. \c mov ax,0o310 ; octal yet again
  1180. \c mov ax,0q310 ; octal yet again
  1181. \c mov ax,11001000b ; binary
  1182. \c mov ax,1100_1000b ; same binary constant
  1183. \c mov ax,1100_1000y ; same binary constant once more
  1184. \c mov ax,0b1100_1000 ; same binary constant yet again
  1185. \c mov ax,0y1100_1000 ; same binary constant yet again
  1186. \S{strings} \I{Strings}\i{Character Strings}
  1187. A character string consists of up to eight characters enclosed in
  1188. either single quotes (\c{'...'}), double quotes (\c{"..."}) or
  1189. backquotes (\c{`...`}). Single or double quotes are equivalent to
  1190. NASM (except of course that surrounding the constant with single
  1191. quotes allows double quotes to appear within it and vice versa); the
  1192. contents of those are represented verbatim. Strings enclosed in
  1193. backquotes support C-style \c{\\}-escapes for special characters.
  1194. The following \i{escape sequences} are recognized by backquoted strings:
  1195. \c \' single quote (')
  1196. \c \" double quote (")
  1197. \c \` backquote (`)
  1198. \c \\\ backslash (\)
  1199. \c \? question mark (?)
  1200. \c \a BEL (ASCII 7)
  1201. \c \b BS (ASCII 8)
  1202. \c \t TAB (ASCII 9)
  1203. \c \n LF (ASCII 10)
  1204. \c \v VT (ASCII 11)
  1205. \c \f FF (ASCII 12)
  1206. \c \r CR (ASCII 13)
  1207. \c \e ESC (ASCII 27)
  1208. \c \377 Up to 3 octal digits - literal byte
  1209. \c \xFF Up to 2 hexadecimal digits - literal byte
  1210. \c \u1234 4 hexadecimal digits - Unicode character
  1211. \c \U12345678 8 hexadecimal digits - Unicode character
  1212. All other escape sequences are reserved. Note that \c{\\0}, meaning a
  1213. \c{NUL} character (ASCII 0), is a special case of the octal escape
  1214. sequence.
  1215. \i{Unicode} characters specified with \c{\\u} or \c{\\U} are converted to
  1216. \i{UTF-8}. For example, the following lines are all equivalent:
  1217. \c db `\u263a` ; UTF-8 smiley face
  1218. \c db `\xe2\x98\xba` ; UTF-8 smiley face
  1219. \c db 0E2h, 098h, 0BAh ; UTF-8 smiley face
  1220. \S{chrconst} \i{Character Constants}
  1221. A character constant consists of a string up to eight bytes long, used
  1222. in an expression context. It is treated as if it was an integer.
  1223. A character constant with more than one byte will be arranged
  1224. with \i{little-endian} order in mind: if you code
  1225. \c mov eax,'abcd'
  1226. then the constant generated is not \c{0x61626364}, but
  1227. \c{0x64636261}, so that if you were then to store the value into
  1228. memory, it would read \c{abcd} rather than \c{dcba}. This is also
  1229. the sense of character constants understood by the Pentium's
  1230. \i\c{CPUID} instruction.
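
As an illustration of why this ordering is convenient, the vendor string returned by \c{CPUID} leaf 0 can be compared against multi-character constants directly. This is a sketch; the label \c{.not_intel} is hypothetical and assumed to be defined elsewhere:

\c         xor eax,eax         ; CPUID leaf 0: vendor identification
\c         cpuid
\c         cmp ebx,'Genu'      ; "GenuineIntel" is returned in
\c         jne .not_intel      ; EBX, EDX, ECX, in that order
\c         cmp edx,'ineI'
\c         jne .not_intel
\c         cmp ecx,'ntel'
\c         jne .not_intel
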
  1231. \S{strconst} \i{String Constants}
  1232. String constants are character strings used in the context of some
  1233. pseudo-instructions, namely the
  1234. \I\c{DW}\I\c{DD}\I\c{DQ}\I\c{DT}\I\c{DO}\I\c{DY}\i\c{DB} family and
  1235. \i\c{INCBIN} (where it represents a filename.) They are also used in
  1236. certain preprocessor directives.
  1237. A string constant looks like a character constant, only longer. It
  1238. is treated as a concatenation of character constants, each of the maximum
  1239. size the context allows. So the following are equivalent:
  1240. \c db 'hello' ; string constant
  1241. \c db 'h','e','l','l','o' ; equivalent character constants
  1242. And the following are also equivalent:
  1243. \c dd 'ninechars' ; doubleword string constant
  1244. \c dd 'nine','char','s' ; becomes three doublewords
  1245. \c db 'ninechars',0,0,0 ; and really looks like this
  1246. Note that when used in a string-supporting context, quoted strings are
  1247. treated as string constants even if they are short enough to be a
  1248. character constant, because otherwise \c{db 'ab'} would have the same
  1249. effect as \c{db 'a'}, which would be silly. Similarly, three-character
  1250. or four-character constants are treated as strings when they are
  1251. operands to \c{DW}, and so forth.
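
The \c{DW} case can be sketched as follows; the comments describe what the rules above imply for each line:

\c         dw 'a'          ; one word: 0x0061
\c         dw 'ab'         ; one word: 0x6261
\c         dw 'abc'        ; string: same as dw 'ab','c'
\c         db 'abc',0      ; the bytes emitted by the previous line
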
  1252. \S{unicode} \I{UTF-16}\I{UTF-32}\i{Unicode} Strings
  1253. The special operators \i\c{__utf16__}, \i\c{__utf16le__},
  1254. \i\c{__utf16be__}, \i\c{__utf32__}, \i\c{__utf32le__} and
  1255. \i\c{__utf32be__} allow definition of Unicode strings. They take a
  1256. string in UTF-8 format and convert it to UTF-16 or UTF-32,
  1257. respectively. Unless the \c{be} forms are specified, the output is
  1258. little-endian.
  1259. For example:
  1260. \c %define u(x) __utf16__(x)
  1261. \c %define w(x) __utf32__(x)
  1262. \c
  1263. \c dw u('C:\WINDOWS'), 0 ; Pathname in UTF-16
  1264. \c dd w(`A + B = \u206a`), 0 ; String in UTF-32
  1265. The UTF operators can be applied either to strings passed to the
  1266. \c{DB} family instructions, or to character constants in an expression
  1267. context.
  1268. \S{fltconst} \I{floating-point, constants}Floating-Point Constants
  1269. \i{Floating-point} constants are acceptable only as arguments to
  1270. \i\c{DB}, \i\c{DW}, \i\c{DD}, \i\c{DQ}, \i\c{DT}, and \i\c{DO}, or as
  1271. arguments to the special operators \i\c{__float8__},
  1272. \i\c{__float16__}, \i\c{__float32__}, \i\c{__float64__},
  1273. \i\c{__float80m__}, \i\c{__float80e__}, \i\c{__float128l__}, and
  1274. \i\c{__float128h__}.
  1275. Floating-point constants are expressed in the traditional form:
  1276. digits, then a period, then optionally more digits, then optionally an
  1277. \c{E} followed by an exponent. The period is mandatory, so that NASM
  1278. can distinguish between \c{dd 1}, which declares an integer constant,
  1279. and \c{dd 1.0} which declares a floating-point constant.
  1280. NASM also supports C99-style hexadecimal floating-point: \c{0x},
  1281. hexadecimal digits, period, optionally more hexadecimal digits, then
  1282. optionally a \c{P} followed by a \e{binary} (not hexadecimal) exponent
  1283. in decimal notation. As an extension, NASM additionally supports the
  1284. \c{0h} and \c{$} prefixes for hexadecimal, as well as binary and octal
  1285. floating-point, using the \c{0b} or \c{0y} and \c{0o} or \c{0q}
  1286. prefixes, respectively.
  1287. Underscores to break up groups of digits are permitted in
  1288. floating-point constants as well.
  1289. Some examples:
  1290. \c db -0.2 ; "Quarter precision"
  1291. \c dw -0.5 ; IEEE 754r/SSE5 half precision
  1292. \c dd 1.2 ; an easy one
  1293. \c dd 1.222_222_222 ; underscores are permitted
  1294. \c dd 0x1p+2 ; 1.0x2^2 = 4.0
  1295. \c dq 0x1p+32 ; 1.0x2^32 = 4 294 967 296.0
  1296. \c dq 1.e10 ; 10 000 000 000.0
  1297. \c dq 1.e+10 ; synonymous with 1.e10
  1298. \c dq 1.e-10 ; 0.000 000 000 1
  1299. \c dt 3.141592653589793238462 ; pi
  1300. \c do 1.e+4000 ; IEEE 754r quad precision
  1301. The 8-bit "quarter-precision" floating-point format is
  1302. sign:exponent:mantissa = 1:4:3 with an exponent bias of 7. This
  1303. appears to be the most frequently used 8-bit floating-point format,
  1304. although it is not covered by any formal standard. This is sometimes
  1305. called a "\i{minifloat}."
  1306. The special operators are used to produce floating-point numbers in
  1307. other contexts. They produce the binary representation of a specific
  1308. floating-point number as an integer, and can be used anywhere integer
  1309. constants are used in an expression. \c{__float80m__} and
  1310. \c{__float80e__} produce the 64-bit mantissa and 16-bit exponent of an
  1311. 80-bit floating-point number, and \c{__float128l__} and
  1312. \c{__float128h__} produce the lower and upper 64-bit halves of a 128-bit
  1313. floating-point number, respectively.
  1314. For example:
  1315. \c mov rax,__float64__(3.141592653589793238462)
  1316. ... would assign the binary representation of pi as a 64-bit floating
  1317. point number into \c{RAX}. This is exactly equivalent to:
  1318. \c mov rax,0x400921fb54442d18
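
The other operators can be used in the same way. A sketch, assuming 64-bit mode and arbitrary register choices; the values in the comments are the standard IEEE 754 and x87 encodings of 1.0:

\c         mov eax,__float32__(1.0)    ; 0x3f800000
\c         mov rbx,__float80m__(1.0)   ; 0x8000000000000000 (mantissa)
\c         mov cx,__float80e__(1.0)    ; 0x3fff (sign and exponent)
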
  1319. NASM cannot do compile-time arithmetic on floating-point constants.
  1320. This is because NASM is designed to be portable - although it always
  1321. generates code to run on x86 processors, the assembler itself can
  1322. run on any system with an ANSI C compiler. Therefore, the assembler
  1323. cannot guarantee the presence of a floating-point unit capable of
  1324. handling the \i{Intel number formats}, and so for NASM to be able to
  1325. do floating arithmetic it would have to include its own complete set
  1326. of floating-point routines, which would significantly increase the
  1327. size of the assembler for very little benefit.
  1328. The special tokens \i\c{__Infinity__}, \i\c{__QNaN__} (or
  1329. \i\c{__NaN__}) and \i\c{__SNaN__} can be used to generate
  1330. \I{infinity}infinities, quiet \i{NaN}s, and signalling NaNs,
  1331. respectively. These are normally used as macros:
  1332. \c %define Inf __Infinity__
  1333. \c %define NaN __QNaN__
  1334. \c
  1335. \c dq +1.5, -Inf, NaN ; Double-precision constants
  1336. The \c{%use fp} standard macro package contains a set of convenience
  1337. macros. See \k{pkg_fp}.
  1338. \S{bcdconst} \I{floating-point, packed BCD constants}Packed BCD Constants
  1339. x87-style packed BCD constants can be used in the same contexts as
  1340. 80-bit floating-point numbers. They are suffixed with \c{p} or
  1341. prefixed with \c{0p}, and can include up to 18 decimal digits.
  1342. As with other numeric constants, underscores can be used to separate
  1343. digits.
  1344. For example:
  1345. \c dt 12_345_678_901_245_678p
  1346. \c dt -12_345_678_901_245_678p
  1347. \c dt +0p33
  1348. \c dt 33p
  1349. \H{expr} \i{Expressions}
  1350. Expressions in NASM are similar in syntax to those in C. Expressions
  1351. are evaluated as 64-bit integers which are then adjusted to the
  1352. appropriate size.
  1353. NASM supports two special tokens in expressions, allowing
  1354. calculations to involve the current assembly position: the
  1355. \I{$, here}\c{$} and \i\c{$$} tokens. \c{$} evaluates to the assembly
  1356. position at the beginning of the line containing the expression; so
  1357. you can code an \i{infinite loop} using \c{JMP $}. \c{$$} evaluates
  1358. to the beginning of the current section; so you can tell how far
  1359. into the section you are by using \c{($-$$)}.
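
A sketch of both tokens in use, assuming the \c{bin} output format and a hypothetical boot-sector layout (the labels and load address are illustrative only):

\c         org 0x7c00              ; hypothetical load address
\c start:  jmp $                   ; infinite loop: jumps to this same line
\c
\c msg:    db 'hello'
\c msglen  equ $-msg               ; 5: bytes emitted since msg
\c
\c         times 510-($-$$) db 0   ; pad the section out to 510 bytes
\c         dw 0xaa55               ; signature in the final two bytes
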
  1360. The arithmetic \i{operators} provided by NASM are listed here, in
  1361. increasing order of \i{precedence}.
  1362. \S{expor} \i\c{|}: \i{Bitwise OR} Operator
  1363. The \c{|} operator gives a bitwise OR, exactly as performed by the
  1364. \c{OR} machine instruction. Bitwise OR is the lowest-priority
  1365. arithmetic operator supported by NASM.
  1366. \S{expxor} \i\c{^}: \i{Bitwise XOR} Operator
  1367. \c{^} provides the bitwise XOR operation.
  1368. \S{expand} \i\c{&}: \i{Bitwise AND} Operator
  1369. \c{&} provides the bitwise AND operation.
  1370. \S{expshift} \i\c{<<} and \i\c{>>}: \i{Bit Shift} Operators
  1371. \c{<<} gives a bit-shift to the left, just as it does in C. So \c{5<<3}
  1372. evaluates to 5 times 8, or 40. \c{>>} gives a bit-shift to the
  1373. right; in NASM, such a shift is \e{always} unsigned, so that
  1374. the bits shifted in from the left-hand end are filled with zero
  1375. rather than a sign-extension of the previous highest bit.
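
For instance (a small sketch; the last line relies on expressions being evaluated as 64-bit integers):

\c         dd 5<<3             ; 40
\c         dd 0x80>>4          ; 8
\c         dq -16>>2           ; 0x3ffffffffffffffc: zeros are shifted in
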
  1376. \S{expplmi} \I{+ opaddition}\c{+} and \I{- opsubtraction}\c{-}:
  1377. \i{Addition} and \i{Subtraction} Operators
  1378. The \c{+} and \c{-} operators do perfectly ordinary addition and
  1379. subtraction.
  1380. \S{expmul} \i\c{*}, \i\c{/}, \i\c{//}, \i\c{%} and \i\c{%%}:
  1381. \i{Multiplication} and \i{Division}
  1382. \c{*} is the multiplication operator. \c{/} and \c{//} are both
  1383. division operators: \c{/} is \i{unsigned division} and \c{//} is
  1384. \i{signed division}. Similarly, \c{%} and \c{%%} provide \I{unsigned
  1385. modulo}\I{modulo operators}unsigned and
  1386. \i{signed modulo} operators respectively.
  1387. NASM, like ANSI C, provides no guarantees about the sensible
  1388. operation of the signed modulo operator.
  1389. Since the \c{%} character is used extensively by the macro
  1390. \i{preprocessor}, you should ensure that both the signed and unsigned
  1391. modulo operators are followed by white space wherever they appear.
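
A brief sketch of these operators, written with white space after each modulo operator as recommended above:

\c         db 17 / 5           ; 3 (unsigned division)
\c         db 17 % 5           ; 2 (unsigned modulo)
\c         db 17 %% 5          ; 2 (signed modulo, positive operands)
\c         db -17 // 5         ; signed division: the dividend is treated as -17
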
  1392. \S{expmul} \i{Unary Operators}
  1393. The highest-priority operators in NASM's expression grammar are those
  1394. which only apply to one argument. These are \I{+ opunary}\c{+}, \I{-
  1395. opunary}\c{-}, \i\c{~}, \I{! opunary}\c{!}, \i\c{SEG}, and the
  1396. \i{integer functions} operators.
  1397. \c{-} negates its operand, \c{+} does nothing (it's provided for
  1398. symmetry with \c{-}), \c{~} computes the \i{one's complement} of its
  1399. operand, \c{!} is the \i{logical negation} operator.
  1400. \c{SEG} provides the \i{segment address}
  1401. of its operand (explained in more detail in \k{segwrt}).
  1402. A set of additional operators with leading and trailing double
  1403. underscores are used to implement the integer functions of the
  1404. \c{ifunc} macro package, see \k{pkg_ifunc}.
  1405. \H{segwrt} \i\c{SEG} and \i\c{WRT}
  1406. When writing large 16-bit programs, which must be split into
  1407. multiple \i{segments}, it is often necessary to be able to refer to
  1408. the \I{segment address}segment part of the address of a symbol. NASM
  1409. supports the \c{SEG} operator to perform this function.
  1410. The \c{SEG} operator returns the \i\e{preferred} segment base of a
  1411. symbol, defined as the segment base relative to which the offset of
  1412. the symbol makes sense. So the code
  1413. \c mov ax,seg symbol
  1414. \c mov es,ax
  1415. \c mov bx,symbol
  1416. will load \c{ES:BX} with a valid pointer to the symbol \c{symbol}.
  1417. Things can be more complex than this: since 16-bit segments and
  1418. \i{groups} may \I{overlapping segments}overlap, you might occasionally
  1419. want to refer to some symbol using a different segment base from the
  1420. preferred one. NASM lets you do this, by the use of the \c{WRT}
  1421. (With Reference To) keyword. So you can do things like
  1422. \c mov ax,weird_seg ; weird_seg is a segment base
  1423. \c mov es,ax
  1424. \c mov bx,symbol wrt weird_seg
  1425. to load \c{ES:BX} with a different, but functionally equivalent,
  1426. pointer to the symbol \c{symbol}.
  1427. NASM supports far (inter-segment) calls and jumps by means of the
  1428. syntax \c{call segment:offset}, where \c{segment} and \c{offset}
  1429. both represent immediate values. So to call a far procedure, you
  1430. could code either of
  1431. \c call (seg procedure):procedure
  1432. \c call weird_seg:(procedure wrt weird_seg)
  1433. (The parentheses are included for clarity, to show the intended
  1434. parsing of the above instructions. They are not necessary in
  1435. practice.)
  1436. NASM supports the syntax \I\c{CALL FAR}\c{call far procedure} as a
  1437. synonym for the first of the above usages. \c{JMP} works identically
  1438. to \c{CALL} in these examples.
  1439. To declare a \i{far pointer} to a data item in a data segment, you
  1440. must code
  1441. \c dw symbol, seg symbol
  1442. NASM supports no convenient synonym for this, though you can always
  1443. invent one using the macro processor.
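
One possible such synonym is sketched below. The macro name \c{farptr} is hypothetical, not something NASM provides:

\c %macro  farptr 1            ; hypothetical macro name
\c         dw %1, seg %1       ; offset first, then segment
\c %endmacro
\c
\c         farptr symbol       ; same as: dw symbol, seg symbol
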
  1444. \H{strict} \i\c{STRICT}: Inhibiting Optimization
  1445. When assembling with the optimizer set to level 2 or higher (see
  1446. \k{opt-O}), NASM will accept size specifiers (\c{BYTE}, \c{WORD},
  1447. \c{DWORD}, \c{QWORD}, \c{TWORD}, \c{OWORD}, \c{YWORD} or \c{ZWORD}),
  1448. but will encode the operand in the smallest possible size. The keyword \c{STRICT}
  1449. can be used to inhibit optimization and force a particular operand to
  1450. be emitted in the specified size. For example, with the optimizer on,
  1451. and in \c{BITS 16} mode,
  1452. \c push dword 33
  1453. is encoded in three bytes \c{66 6A 21}, whereas
  1454. \c push strict dword 33
  1455. is encoded in six bytes, with a full dword immediate operand \c{66 68
  1456. 21 00 00 00}.
  1457. With the optimizer off, the same code (six bytes) is generated whether
  1458. the \c{STRICT} keyword was used or not.
  1459. \H{crit} \i{Critical Expressions}
  1460. Although NASM has an optional multi-pass optimizer, there are some
  1461. expressions which must be resolvable on the first pass. These are
  1462. called \e{Critical Expressions}.
  1463. The first pass is used to determine the size of all the assembled
  1464. code and data, so that the second pass, when generating all the
  1465. code, knows all the symbol addresses the code refers to. So one
  1466. thing NASM can't handle is code whose size depends on the value of a
  1467. symbol declared after the code in question. For example,
  1468. \c times (label-$) db 0
  1469. \c label: db 'Where am I?'
  1470. The argument to \i\c{TIMES} in this case could equally legally
  1471. evaluate to anything at all; NASM will reject this example because
  1472. it cannot tell the size of the \c{TIMES} line when it first sees it.
  1473. It will just as firmly reject the slightly \I{paradox}paradoxical
  1474. code
  1475. \c times (label-$+1) db 0
  1476. \c label: db 'NOW where am I?'
  1477. in which \e{any} value for the \c{TIMES} argument is by definition
  1478. wrong!
  1479. NASM rejects these examples by means of a concept called a
  1480. \e{critical expression}, which is defined to be an expression whose
  1481. value is required to be computable in the first pass, and which must
  1482. therefore depend only on symbols defined before it. The argument to
  1483. the \c{TIMES} prefix is a critical expression.
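
By contrast, a \c{TIMES} argument that depends only on symbols defined earlier is acceptable. A sketch, padding a hypothetical \c{buffer} out to 64 bytes:

\c buffer: db 'hello, world'
\c         times 64-($-buffer) db ' '  ; buffer is already defined: accepted
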
  1484. \H{locallab} \i{Local Labels}
  1485. NASM gives special treatment to symbols beginning with a \i{period}.
  1486. A label beginning with a single period is treated as a \e{local}
  1487. label, which means that it is associated with the previous non-local
  1488. label. So, for example:
  1489. \c label1 ; some code
  1490. \c
  1491. \c .loop
  1492. \c ; some more code
  1493. \c
  1494. \c jne .loop
  1495. \c ret
  1496. \c
  1497. \c label2 ; some code
  1498. \c
  1499. \c .loop
  1500. \c ; some more code
  1501. \c
  1502. \c jne .loop
  1503. \c ret
  1504. In the above code fragment, each \c{JNE} instruction jumps to the
  1505. line immediately before it, because the two definitions of \c{.loop}
  1506. are kept separate by virtue of each being associated with the
  1507. previous non-local label.
  1508. This form of local label handling is borrowed from the old Amiga
  1509. assembler \i{DevPac}; however, NASM goes one step further, in
  1510. allowing access to local labels from other parts of the code. This
  1511. is achieved by means of \e{defining} a local label in terms of the
  1512. previous non-local label: the first definition of \c{.loop} above is
  1513. really defining a symbol called \c{label1.loop}, and the second
  1514. defines a symbol called \c{label2.loop}. So, if you really needed
  1515. to, you could write
  1516. \c label3 ; some more code
  1517. \c ; and some more
  1518. \c
  1519. \c jmp label1.loop
  1520. Sometimes it is useful - in a macro, for instance - to be able to
  1521. define a label which can be referenced from anywhere but which
  1522. doesn't interfere with the normal local-label mechanism. Such a
  1523. label can't be non-local because it would interfere with subsequent
  1524. definitions of, and references to, local labels; and it can't be
  1525. local because the macro that defined it wouldn't know the label's
  1526. full name. NASM therefore introduces a third type of label, which is
  1527. probably only useful in macro definitions: if a label begins with
  1528. the \I{label prefix}special prefix \i\c{..@}, then it does nothing
  1529. to the local label mechanism. So you could code
  1530. \c label1: ; a non-local label
  1531. \c .local: ; this is really label1.local
  1532. \c ..@foo: ; this is a special symbol
  1533. \c label2: ; another non-local label
  1534. \c .local: ; this is really label2.local
  1535. \c
  1536. \c jmp ..@foo ; this will jump three lines up
  1537. NASM has the capacity to define other special symbols beginning with
  1538. a double period: for example, \c{..start} is used to specify the
  1539. entry point in the \c{obj} output format (see \k{dotdotstart}),
  1540. \c{..imagebase} is used to find out the offset from a base address
  1541. of the current image in the \c{win64} output format (see \k{win64pic}).
  1542. So just keep in mind that symbols beginning with a double period are
  1543. special.
  1544. \C{preproc} The NASM \i{Preprocessor}
  1545. NASM contains a powerful \i{macro processor}, which supports
  1546. conditional assembly, multi-level file inclusion, two forms of macro
  1547. (single-line and multi-line), and a `context stack' mechanism for
  1548. extra macro power. Preprocessor directives all begin with a \c{%}
  1549. sign.
  1550. The preprocessor collapses all lines which end with a backslash (\\)
  1551. character into a single line. Thus:
  1552. \c %define THIS_VERY_LONG_MACRO_NAME_IS_DEFINED_TO \\
  1553. \c THIS_VALUE
  1554. will work like a single-line macro without the backslash-newline
  1555. sequence.
  1556. \H{slmacro} \i{Single-Line Macros}
  1557. \S{define} The Normal Way: \I\c{%idefine}\i\c{%define}
  1558. Single-line macros are defined using the \c{%define} preprocessor
  1559. directive. The definitions work in a similar way to C; so you can do
  1560. things like
  1561. \c %define ctrl 0x1F &
  1562. \c %define param(a,b) ((a)+(a)*(b))
  1563. \c
  1564. \c mov byte [param(2,ebx)], ctrl 'D'
  1565. which will expand to
  1566. \c mov byte [(2)+(2)*(ebx)], 0x1F & 'D'
  1567. When the expansion of a single-line macro contains tokens which
  1568. invoke another macro, the expansion is performed at invocation time,
  1569. not at definition time. Thus the code
  1570. \c %define a(x) 1+b(x)
  1571. \c %define b(x) 2*x
  1572. \c
  1573. \c mov ax,a(8)
  1574. will evaluate in the expected way to \c{mov ax,1+2*8}, even though
  1575. the macro \c{b} wasn't defined at the time of definition of \c{a}.
  1576. Macros defined with \c{%define} are \i{case sensitive}: after
  1577. \c{%define foo bar}, only \c{foo} will expand to \c{bar}: \c{Foo} or
  1578. \c{FOO} will not. By using \c{%idefine} instead of \c{%define} (the
  1579. `i' stands for `insensitive') you can define all the case variants
  1580. of a macro at once, so that \c{%idefine foo bar} would cause
  1581. \c{foo}, \c{Foo}, \c{FOO}, \c{fOO} and so on all to expand to
  1582. \c{bar}.
  1583. There is a mechanism which detects when a macro call has occurred as
  1584. a result of a previous expansion of the same macro, to guard against
  1585. \i{circular references} and infinite loops. If this happens, the
  1586. preprocessor will only expand the first occurrence of the macro.
  1587. Hence, if you code
  1588. \c %define a(x) 1+a(x)
  1589. \c
  1590. \c mov ax,a(3)
  1591. the macro \c{a(3)} will expand once, becoming \c{1+a(3)}, and will
  1592. then expand no further. This behaviour can be useful: see \k{32c}
  1593. for an example of its use.
  1594. You can \I{overloading, single-line macros}overload single-line
  1595. macros: if you write
  1596. \c %define foo(x) 1+x
  1597. \c %define foo(x,y) 1+x*y
  1598. the preprocessor will be able to handle both types of macro call,
  1599. by counting the parameters you pass; so \c{foo(3)} will become
  1600. \c{1+3} whereas \c{foo(ebx,2)} will become \c{1+ebx*2}. However, if
  1601. you define
  1602. \c %define foo bar
  1603. then no other definition of \c{foo} will be accepted: a macro with
  1604. no parameters prohibits the definition of the same name as a macro
  1605. \e{with} parameters, and vice versa.
  1606. This doesn't prevent single-line macros being \e{redefined}: you can
  1607. perfectly well define a macro with
  1608. \c %define foo bar
  1609. and then re-define it later in the same source file with
  1610. \c %define foo baz
  1611. Then everywhere the macro \c{foo} is invoked, it will be expanded
  1612. according to the most recent definition. This is particularly useful
  1613. when defining single-line macros with \c{%assign} (see \k{assign}).
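
A typical case of such redefinition is a counter that is updated inside a \c{%rep} block. This is a sketch; the macro name \c{i} is arbitrary:

\c %assign i 0
\c %rep    4
\c         db i                ; emits 0, 1, 2, 3 in turn
\c %assign i i+1               ; redefine i with its incremented value
\c %endrep
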
  1614. You can \i{pre-define} single-line macros using the `-d' option on
  1615. the NASM command line: see \k{opt-d}.
  1616. \S{xdefine} Resolving \c{%define}: \I\c{%ixdefine}\i\c{%xdefine}
  1617. To have a reference to an embedded single-line macro resolved at the
  1618. time that the embedding macro is \e{defined}, as opposed to when the
  1619. embedding macro is \e{expanded}, you need a different mechanism to the
  1620. one offered by \c{%define}. The solution is to use \c{%xdefine}, or
  1621. its \I{case sensitive}case-insensitive counterpart \c{%ixdefine}.
  1622. Suppose you have the following code:
  1623. \c %define isTrue 1
  1624. \c %define isFalse isTrue
  1625. \c %define isTrue 0
  1626. \c
  1627. \c val1: db isFalse
  1628. \c
  1629. \c %define isTrue 1
  1630. \c
  1631. \c val2: db isFalse
  1632. In this case, \c{val1} is equal to 0, and \c{val2} is equal to 1.
  1633. This is because, when a single-line macro is defined using
  1634. \c{%define}, it is expanded only when it is called. As \c{isFalse}
  1635. expands to \c{isTrue}, the expansion will be the current value of
  1636. \c{isTrue}. The first time it is called that is 0, and the second
  1637. time it is 1.
  1638. If you wanted \c{isFalse} to expand to the value assigned to the
  1639. embedded macro \c{isTrue} at the time that \c{isFalse} was defined,
  1640. you need to change the above code to use \c{%xdefine}.
  1641. \c %xdefine isTrue 1
  1642. \c %xdefine isFalse isTrue
  1643. \c %xdefine isTrue 0
  1644. \c
  1645. \c val1: db isFalse
  1646. \c
  1647. \c %xdefine isTrue 1
  1648. \c
  1649. \c val2: db isFalse
  1650. Now, each time that \c{isFalse} is called, it expands to 1,
  1651. as that is what the embedded macro \c{isTrue} expanded to at
  1652. the time that \c{isFalse} was defined.
  1653. \S{indmacro} \i{Macro Indirection}: \I\c{%[}\c{%[...]}
  1654. The \c{%[...]} construct can be used to expand macros in contexts
  1655. where macro expansion would otherwise not occur, including in the
  1656. names of other macros. For example, if you have a set of macros named
  1657. \c{Foo16}, \c{Foo32} and \c{Foo64}, you could write:
  1658. \c mov ax,Foo%[__BITS__] ; The Foo value
  1659. to use the builtin macro \c{__BITS__} (see \k{bitsm}) to automatically
  1660. select between them. Similarly, the two statements:
  1661. \c %xdefine Bar Quux ; Expands due to %xdefine
  1662. \c %define Bar %[Quux] ; Expands due to %[...]
  1663. have, in fact, exactly the same effect.
\c{%[...]} concatenates to adjacent tokens in the same way that
multi-line macro parameters do; see \k{concat} for details.
  1666. \S{concat%+} Concatenating Single Line Macro Tokens: \i\c{%+}
Individual tokens in single-line macros can be concatenated to produce
longer tokens for later processing. This can be useful if there are
several similar macros that perform similar functions.
Please note that a space is required after \c{%+}, in order to
disambiguate it from the syntax \c{%+1} used in multi-line macros.
  1672. As an example, consider the following:
  1673. \c %define BDASTART 400h ; Start of BIOS data area
  1674. \c struc tBIOSDA ; its structure
  1675. \c .COM1addr RESW 1
  1676. \c .COM2addr RESW 1
  1677. \c ; ..and so on
  1678. \c endstruc
  1679. Now, if we need to access the elements of tBIOSDA in different places,
  1680. we can end up with:
  1681. \c mov ax,BDASTART + tBIOSDA.COM1addr
  1682. \c mov bx,BDASTART + tBIOSDA.COM2addr
  1683. This will become pretty ugly (and tedious) if used in many places, and
  1684. can be reduced in size significantly by using the following macro:
\c ; Macro to access BIOS variables by their names (from tBIOSDA):
  1686. \c %define BDA(x) BDASTART + tBIOSDA. %+ x
  1687. Now the above code can be written as:
  1688. \c mov ax,BDA(COM1addr)
  1689. \c mov bx,BDA(COM2addr)
  1690. Using this feature, we can simplify references to a lot of macros (and,
  1691. in turn, reduce typing errors).
  1692. \S{selfref%?} The Macro Name Itself: \i\c{%?} and \i\c{%??}
  1693. The special symbols \c{%?} and \c{%??} can be used to reference the
macro name itself inside a macro expansion; this is supported for both
single- and multi-line macros. \c{%?} refers to the macro name as
  1696. \e{invoked}, whereas \c{%??} refers to the macro name as
  1697. \e{declared}. The two are always the same for case-sensitive
  1698. macros, but for case-insensitive macros, they can differ.
  1699. For example:
  1700. \c %idefine Foo mov %?,%??
  1701. \c
  1702. \c foo
  1703. \c FOO
  1704. will expand to:
  1705. \c mov foo,Foo
  1706. \c mov FOO,Foo
  1707. The sequence:
  1708. \c %idefine keyword $%?
  1709. can be used to make a keyword "disappear", for example in case a new
  1710. instruction has been used as a label in older code. For example:
  1711. \c %idefine pause $%? ; Hide the PAUSE instruction
  1712. \S{undef} Undefining Single-Line Macros: \i\c{%undef}
  1713. Single-line macros can be removed with the \c{%undef} directive. For
  1714. example, the following sequence:
  1715. \c %define foo bar
  1716. \c %undef foo
  1717. \c
  1718. \c mov eax, foo
  1719. will expand to the instruction \c{mov eax, foo}, since after
  1720. \c{%undef} the macro \c{foo} is no longer defined.
Macros that would otherwise be pre-defined can be undefined using the
`-u' option on the NASM command line: see \k{opt-u}.
  1724. \S{assign} \i{Preprocessor Variables}: \i\c{%assign}
  1725. An alternative way to define single-line macros is by means of the
  1726. \c{%assign} command (and its \I{case sensitive}case-insensitive
  1727. counterpart \i\c{%iassign}, which differs from \c{%assign} in
  1728. exactly the same way that \c{%idefine} differs from \c{%define}).
  1729. \c{%assign} is used to define single-line macros which take no
  1730. parameters and have a numeric value. This value can be specified in
  1731. the form of an expression, and it will be evaluated once, when the
  1732. \c{%assign} directive is processed.
  1733. Like \c{%define}, macros defined using \c{%assign} can be re-defined
  1734. later, so you can do things like
  1735. \c %assign i i+1
  1736. to increment the numeric value of a macro.
  1737. \c{%assign} is useful for controlling the termination of \c{%rep}
  1738. preprocessor loops: see \k{rep} for an example of this. Another
  1739. use for \c{%assign} is given in \k{16c} and \k{32c}.
  1740. The expression passed to \c{%assign} is a \i{critical expression}
  1741. (see \k{crit}), and must also evaluate to a pure number (rather than
  1742. a relocatable reference such as a code or data address, or anything
  1743. involving a register).
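A minimal sketch of the increment idiom shown above, assuming a counter
macro called \c{i} (an arbitrary name) and the \c{%rep} directive
described in \k{rep}:
\c %assign i 0
\c %rep 4
\c         db i*i                 ; i takes the values 0, 1, 2, 3 in turn
\c %assign i i+1
\c %endrep
This should assemble the four bytes 0, 1, 4 and 9, since each
\c{%assign} evaluates \c{i+1} using the value \c{i} had at that point.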
  1744. \S{defstr} Defining Strings: \I\c{%idefstr}\i\c{%defstr}
\c{%defstr}, and its case-insensitive counterpart \c{%idefstr}, define
or redefine a single-line macro without parameters, but convert the
entire right-hand side, after macro expansion, to a quoted string
before definition.
  1749. For example:
  1750. \c %defstr test TEST
  1751. is equivalent to
  1752. \c %define test 'TEST'
  1753. This can be used, for example, with the \c{%!} construct (see
  1754. \k{getenv}):
  1755. \c %defstr PATH %!PATH ; The operating system PATH variable
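Because the conversion happens after macro expansion, \c{%defstr} can
also turn the current value of another macro into a string. A small
sketch, in which \c{BUILD} and \c{BUILD_STR} are made-up names:
\c %define BUILD release
\c %defstr BUILD_STR BUILD
\c
\c         db BUILD_STR, 0        ; same as db 'release', 0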
  1756. \S{deftok} Defining Tokens: \I\c{%ideftok}\i\c{%deftok}
\c{%deftok}, and its case-insensitive counterpart \c{%ideftok}, define
or redefine a single-line macro without parameters, but convert the
second parameter, after string conversion, to a sequence of tokens.
  1760. For example:
  1761. \c %deftok test 'TEST'
  1762. is equivalent to
  1763. \c %define test TEST
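One possible use, sketched here with a hypothetical macro name
\c{dest}, is to turn a register name held in a string back into an
operand:
\c %deftok dest 'eax'
\c
\c         mov dest,1             ; same as mov eax,1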
  1764. \H{strlen} \i{String Manipulation in Macros}
  1765. It's often useful to be able to handle strings in macros. NASM
  1766. supports a few simple string handling macro operators from which
  1767. more complex operations can be constructed.
All the string operators define or redefine a single-line macro,
giving it either a string or a numeric value. When producing a string
value, an operator may change the style of quoting of the input string
or strings, and possibly use \c{\\}-escapes inside \c{`}-quoted strings.
  1772. \S{strcat} \i{Concatenating Strings}: \i\c{%strcat}
The \c{%strcat} operator concatenates quoted strings and assigns the
result to a single-line macro.
  1775. For example:
  1776. \c %strcat alpha "Alpha: ", '12" screen'
  1777. ... would assign the value \c{'Alpha: 12" screen'} to \c{alpha}.
  1778. Similarly:
  1779. \c %strcat beta '"foo"\', "'bar'"
  1780. ... would assign the value \c{`"foo"\\\\'bar'`} to \c{beta}.
  1781. The use of commas to separate strings is permitted but optional.
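The following sketch assumes that, as with \c{%strlen}, single-line
macros which expand to strings may be mixed with literal strings. The
names \c{progname} and \c{banner} are invented for the example:
\c %define progname 'demo'
\c %strcat banner progname, ' v', '1.0'
\c
\c         db banner, 0           ; same bytes as db 'demo v1.0', 0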
  1782. \S{strlen} \i{String Length}: \i\c{%strlen}
  1783. The \c{%strlen} operator assigns the length of a string to a macro.
  1784. For example:
  1785. \c %strlen charcnt 'my string'
  1786. In this example, \c{charcnt} would receive the value 9, just as
  1787. if an \c{%assign} had been used. In this example, \c{'my string'}
  1788. was a literal string but it could also have been a single-line
  1789. macro that expands to a string, as in the following example:
  1790. \c %define sometext 'my string'
  1791. \c %strlen charcnt sometext
  1792. As in the first case, this would result in \c{charcnt} being
  1793. assigned the value of 9.
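A typical use is emitting a length-prefixed string. In this sketch,
\c{msg} and \c{msglen} are illustrative names:
\c %define msg 'hello'
\c %strlen msglen msg
\c
\c         db msglen, msg         ; same as db 5, 'hello'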
  1794. \S{substr} \i{Extracting Substrings}: \i\c{%substr}
  1795. Individual letters or substrings in strings can be extracted using the
  1796. \c{%substr} operator. An example of its use is probably more useful
  1797. than the description:
  1798. \c %substr mychar 'xyzw' 1 ; equivalent to %define mychar 'x'
  1799. \c %substr mychar 'xyzw' 2 ; equivalent to %define mychar 'y'
  1800. \c %substr mychar 'xyzw' 3 ; equivalent to %define mychar 'z'
  1801. \c %substr mychar 'xyzw' 2,2 ; equivalent to %define mychar 'yz'
  1802. \c %substr mychar 'xyzw' 2,-1 ; equivalent to %define mychar 'yzw'
  1803. \c %substr mychar 'xyzw' 2,-2 ; equivalent to %define mychar 'yz'
  1804. As with \c{%strlen} (see \k{strlen}), the first parameter is the
  1805. single-line macro to be created and the second is the string. The
third parameter specifies the first character to be selected, and the
optional fourth parameter (preceded by a comma) is the length. Note
that the first index is 1, not 0, and the last index is equal to the
  1809. value that \c{%strlen} would assign given the same string. Index
  1810. values out of range result in an empty string. A negative length
  1811. means "until N-1 characters before the end of string", i.e. \c{-1}
  1812. means until end of string, \c{-2} until one character before, etc.
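Combined with \c{%strlen}, \c{%assign} and \c{%rep} (see \k{rep}),
\c{%substr} can be used to walk through a string one character at a
time. The following is only a sketch, and the names \c{text}, \c{len},
\c{i} and \c{ch} are arbitrary:
\c %define text 'abc'
\c %strlen len text
\c %assign i 1
\c %rep len
\c %substr ch text i              ; ch is the i-th character of text
\c         db ch, 0
\c %assign i i+1
\c %endrep
This should emit each character of \c{text} followed by a zero byte.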
  1813. \H{mlmacro} \i{Multi-Line Macros}: \I\c{%imacro}\i\c{%macro}
  1814. Multi-line macros are much more like the type of macro seen in MASM
  1815. and TASM: a multi-line macro definition in NASM looks something like
  1816. this.
  1817. \c %macro prologue 1
  1818. \c
  1819. \c push ebp
  1820. \c mov ebp,esp
  1821. \c sub esp,%1
  1822. \c
  1823. \c %endmacro
  1824. This defines a C-like function prologue as a macro: so you would
  1825. invoke the macro with a call such as
  1826. \c myfunc: prologue 12
  1827. which would expand to the three lines of code
  1828. \c myfunc: push ebp
  1829. \c mov ebp,esp
  1830. \c sub esp,12
  1831. The number \c{1} after the macro name in the \c{%macro} line defines
  1832. the number of parameters the macro \c{prologue} expects to receive.
  1833. The use of \c{%1} inside the macro definition refers to the first
  1834. parameter to the macro call. With a macro taking more than one
  1835. parameter, subsequent parameters would be referred to as \c{%2},
  1836. \c{%3} and so on.
  1837. Multi-line macros, like single-line macros, are \i{case-sensitive},
  1838. unless you define them using the alternative directive \c{%imacro}.
  1839. If you need to pass a comma as \e{part} of a parameter to a
  1840. multi-line macro, you can do that by enclosing the entire parameter
  1841. in \I{braces, around macro parameters}braces. So you could code
  1842. things like
  1843. \c %macro silly 2
  1844. \c
  1845. \c %2: db %1
  1846. \c
  1847. \c %endmacro
  1848. \c
  1849. \c silly 'a', letter_a ; letter_a: db 'a'
  1850. \c silly 'ab', string_ab ; string_ab: db 'ab'
  1851. \c silly {13,10}, crlf ; crlf: db 13,10
  1852. \S{mlmacover} Overloading Multi-Line Macros\I{overloading, multi-line macros}
  1853. As with single-line macros, multi-line macros can be overloaded by
  1854. defining the same macro name several times with different numbers of
  1855. parameters. This time, no exception is made for macros with no
  1856. parameters at all. So you could define
  1857. \c %macro prologue 0
  1858. \c
  1859. \c push ebp
  1860. \c mov ebp,esp
  1861. \c
  1862. \c %endmacro
  1863. to define an alternative form of the function prologue which
  1864. allocates no local stack space.
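With both definitions of \c{prologue} in effect, the number of
parameters at the call site selects the form to expand. For instance,
with two hypothetical labels:
\c func_a: prologue 12
\c func_b: prologue
Here \c{func_a} gets the one-parameter form, which reserves 12 bytes of
stack, while \c{func_b} gets the parameterless form, which only sets up
the frame pointer.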
  1865. Sometimes, however, you might want to `overload' a machine
  1866. instruction; for example, you might want to define
  1867. \c %macro push 2
  1868. \c
  1869. \c push %1
  1870. \c push %2
  1871. \c
  1872. \c %endmacro
  1873. so that you could code
  1874. \c push ebx ; this line is not a macro call
  1875. \c push eax,ecx ; but this one is
  1876. Ordinarily, NASM will give a warning for the first of the above two
  1877. lines, since \c{push} is now defined to be a macro, and is being
  1878. invoked with a number of parameters for which no definition has been
  1879. given. The correct code will still be generated, but the assembler
  1880. will give a warning. This warning can be disabled by the use of the
  1881. \c{-w-macro-params} command-line option (see \k{opt-w}).
  1882. \S{maclocal} \i{Macro-Local Labels}
  1883. NASM allows you to define labels within a multi-line macro
  1884. definition in such a way as to make them local to the macro call: so
  1885. calling the same macro multiple times will use a different label
  1886. each time. You do this by prefixing \i\c{%%} to the label name. So
  1887. you can invent an instruction which executes a \c{RET} if the \c{Z}
  1888. flag is set by doing this:
  1889. \c %macro retz 0
  1890. \c
  1891. \c jnz %%skip
  1892. \c ret
  1893. \c %%skip:
  1894. \c
  1895. \c %endmacro
  1896. You can call this macro as many times as you want, and every time
  1897. you call it NASM will make up a different `real' name to substitute
  1898. for the label \c{%%skip}. The names NASM invents are of the form
  1899. \c{..@2345.skip}, where the number 2345 changes with every macro
  1900. call. The \i\c{..@} prefix prevents macro-local labels from
  1901. interfering with the local label mechanism, as described in
  1902. \k{locallab}. You should avoid defining your own labels in this form
  1903. (the \c{..@} prefix, then a number, then another period) in case
  1904. they interfere with macro-local labels.
  1905. \S{mlmacgre} \i{Greedy Macro Parameters}
  1906. Occasionally it is useful to define a macro which lumps its entire
  1907. command line into one parameter definition, possibly after
  1908. extracting one or two smaller parameters from the front. An example
  1909. might be a macro to write a text string to a file in MS-DOS, where
  1910. you might want to be able to write
  1911. \c writefile [filehandle],"hello, world",13,10
  1912. NASM allows you to define the last parameter of a macro to be
  1913. \e{greedy}, meaning that if you invoke the macro with more
  1914. parameters than it expects, all the spare parameters get lumped into
  1915. the last defined one along with the separating commas. So if you
  1916. code:
  1917. \c %macro writefile 2+
  1918. \c
  1919. \c jmp %%endstr
  1920. \c %%str: db %2
  1921. \c %%endstr:
  1922. \c mov dx,%%str
  1923. \c mov cx,%%endstr-%%str
  1924. \c mov bx,%1
  1925. \c mov ah,0x40
  1926. \c int 0x21
  1927. \c
  1928. \c %endmacro
  1929. then the example call to \c{writefile} above will work as expected:
  1930. the text before the first comma, \c{[filehandle]}, is used as the
  1931. first macro parameter and expanded when \c{%1} is referred to, and
  1932. all the subsequent text is lumped into \c{%2} and placed after the
  1933. \c{db}.
  1934. The greedy nature of the macro is indicated to NASM by the use of
  1935. the \I{+ modifier}\c{+} sign after the parameter count on the
  1936. \c{%macro} line.
  1937. If you define a greedy macro, you are effectively telling NASM how
  1938. it should expand the macro given \e{any} number of parameters from
  1939. the actual number specified up to infinity; in this case, for
  1940. example, NASM now knows what to do when it sees a call to
  1941. \c{writefile} with 2, 3, 4 or more parameters. NASM will take this
  1942. into account when overloading macros, and will not allow you to
  1943. define another form of \c{writefile} taking 4 parameters (for
  1944. example).
  1945. Of course, the above macro could have been implemented as a
  1946. non-greedy macro, in which case the call to it would have had to
  1947. look like
  1948. \c writefile [filehandle], {"hello, world",13,10}
  1949. NASM provides both mechanisms for putting \i{commas in macro
  1950. parameters}, and you choose which one you prefer for each macro
  1951. definition.
  1952. See \k{sectmac} for a better way to write the above macro.
  1953. \S{mlmacrange} \i{Macro Parameters Range}
NASM allows you to expand a range of parameters via the special
construct \c{%\{x:y\}}, where \c{x} is the first parameter index and
\c{y} is the last. Either index can be negative or positive, but must
never be zero.
  1957. For example
  1958. \c %macro mpar 1-*
  1959. \c db %{3:5}
  1960. \c %endmacro
  1961. \c
  1962. \c mpar 1,2,3,4,5,6
expands to the range \c{3,4,5}.
The range can also be reversed, so that
  1965. \c %macro mpar 1-*
  1966. \c db %{5:3}
  1967. \c %endmacro
  1968. \c
  1969. \c mpar 1,2,3,4,5,6
expands to the range \c{5,4,3}.
Parameters can also be addressed via negative indices, in which case
NASM counts them from the end of the parameter list, much like
negative indices in Python.
  1974. \c %macro mpar 1-*
  1975. \c db %{-1:-3}
  1976. \c %endmacro
  1977. \c
  1978. \c mpar 1,2,3,4,5,6
expands to the range \c{6,5,4}.
Note that NASM uses a \i{comma} to separate the parameters being expanded.
As a useful trick, the range \c{%\{-1:-1\}} gives you the \i{last}
argument passed to a macro.
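A short sketch of that last trick, using a hypothetical macro named
\c{lastarg}:
\c %macro lastarg 1-*
\c         db %{-1:-1}
\c %endmacro
\c
\c         lastarg 1,2,3          ; expands to db 3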
  1983. \S{mlmacdef} \i{Default Macro Parameters}
  1984. NASM also allows you to define a multi-line macro with a \e{range}
  1985. of allowable parameter counts. If you do this, you can specify
  1986. defaults for \i{omitted parameters}. So, for example:
  1987. \c %macro die 0-1 "Painful program death has occurred."
  1988. \c
  1989. \c writefile 2,%1
  1990. \c mov ax,0x4c01
  1991. \c int 0x21
  1992. \c
  1993. \c %endmacro
  1994. This macro (which makes use of the \c{writefile} macro defined in
  1995. \k{mlmacgre}) can be called with an explicit error message, which it
  1996. will display on the error output stream before exiting, or it can be
  1997. called with no parameters, in which case it will use the default
  1998. error message supplied in the macro definition.
  1999. In general, you supply a minimum and maximum number of parameters
  2000. for a macro of this type; the minimum number of parameters are then
  2001. required in the macro call, and then you provide defaults for the
  2002. optional ones. So if a macro definition began with the line
  2003. \c %macro foobar 1-3 eax,[ebx+2]
  2004. then it could be called with between one and three parameters, and
  2005. \c{%1} would always be taken from the macro call. \c{%2}, if not
  2006. specified by the macro call, would default to \c{eax}, and \c{%3} if
  2007. not specified would default to \c{[ebx+2]}.
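To make the defaulting concrete, here is a sketch that gives
\c{foobar} a body; the body is invented for illustration and is not
part of any real macro:
\c %macro foobar 1-3 eax,[ebx+2]
\c
\c         mov %1,%2
\c         add %1,%3
\c
\c %endmacro
\c
\c         foobar ecx             ; mov ecx,eax  /  add ecx,[ebx+2]
\c         foobar ecx,edx         ; mov ecx,edx  /  add ecx,[ebx+2]
\c         foobar ecx,edx,4       ; mov ecx,edx  /  add ecx,4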
  2008. You can provide extra information to a macro by providing
  2009. too many default parameters:
  2010. \c %macro quux 1 something
  2011. This will trigger a warning by default; see \k{opt-w} for
  2012. more information.
  2013. When \c{quux} is invoked, it receives not one but two parameters.
  2014. \c{something} can be referred to as \c{%2}. The difference
between passing \c{something} this way and writing \c{something}
in the macro body is that, this way, \c{something} is evaluated
when the macro is defined, not when it is expanded.
  2018. You may omit parameter defaults from the macro definition, in which
  2019. case the parameter default is taken to be blank. This can be useful
  2020. for macros which can take a variable number of parameters, since the
  2021. \i\c{%0} token (see \k{percent0}) allows you to determine how many
  2022. parameters were really passed to the macro call.
  2023. This defaulting mechanism can be combined with the greedy-parameter
  2024. mechanism; so the \c{die} macro above could be made more powerful,
  2025. and more useful, by changing the first line of the definition to
  2026. \c %macro die 0-1+ "Painful program death has occurred.",13,10
  2027. The maximum parameter count can be infinite, denoted by \c{*}. In
  2028. this case, of course, it is impossible to provide a \e{full} set of
  2029. default parameters. Examples of this usage are shown in \k{rotate}.
  2030. \S{percent0} \i\c{%0}: \I{counting macro parameters}Macro Parameter Counter
  2031. The parameter reference \c{%0} will return a numeric constant giving the
  2032. number of parameters received, that is, if \c{%0} is n then \c{%}n is the
  2033. last parameter. \c{%0} is mostly useful for macros that can take a variable
  2034. number of parameters. It can be used as an argument to \c{%rep}
  2035. (see \k{rep}) in order to iterate through all the parameters of a macro.
  2036. Examples are given in \k{rotate}.
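As a simple sketch, the hypothetical macro below just stores the
number of parameters it was given:
\c %macro countargs 0-*
\c         db %0
\c %endmacro
\c
\c         countargs 10,20,30     ; expands to db 3
\c         countargs 10           ; expands to db 1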
\S{percent00} \i\c{%00}: \I{label preceding macro}Label Preceding Macro
\c{%00} will return the label preceding the macro invocation, if any. The
  2039. label must be on the same line as the macro invocation, may be a local label
  2040. (see \k{locallab}), and need not end in a colon.
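A sketch of one possible use: the hypothetical macro \c{cstring} below
places the caller's label directly on the data it emits. The label is
written without a colon at the call site, and the macro body supplies
one:
\c %macro cstring 1
\c %00:    db %1, 0
\c %endmacro
\c
\c greeting cstring 'hello'
Under these assumptions the invocation should behave like
\c{greeting: db 'hello', 0}.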
  2041. \S{rotate} \i\c{%rotate}: \i{Rotating Macro Parameters}
  2042. Unix shell programmers will be familiar with the \I{shift
  2043. command}\c{shift} shell command, which allows the arguments passed
  2044. to a shell script (referenced as \c{$1}, \c{$2} and so on) to be
  2045. moved left by one place, so that the argument previously referenced
  2046. as \c{$2} becomes available as \c{$1}, and the argument previously
  2047. referenced as \c{$1} is no longer available at all.
  2048. NASM provides a similar mechanism, in the form of \c{%rotate}. As
  2049. its name suggests, it differs from the Unix \c{shift} in that no
  2050. parameters are lost: parameters rotated off the left end of the
  2051. argument list reappear on the right, and vice versa.
  2052. \c{%rotate} is invoked with a single numeric argument (which may be
  2053. an expression). The macro parameters are rotated to the left by that
  2054. many places. If the argument to \c{%rotate} is negative, the macro
  2055. parameters are rotated to the right.
  2056. \I{iterating over macro parameters}So a pair of macros to save and
  2057. restore a set of registers might work as follows:
  2058. \c %macro multipush 1-*
  2059. \c
  2060. \c %rep %0
  2061. \c push %1
  2062. \c %rotate 1
  2063. \c %endrep
  2064. \c
  2065. \c %endmacro
  2066. This macro invokes the \c{PUSH} instruction on each of its arguments
  2067. in turn, from left to right. It begins by pushing its first
  2068. argument, \c{%1}, then invokes \c{%rotate} to move all the arguments
  2069. one place to the left, so that the original second argument is now
  2070. available as \c{%1}. Repeating this procedure as many times as there
  2071. were arguments (achieved by supplying \c{%0} as the argument to
  2072. \c{%rep}) causes each argument in turn to be pushed.
  2073. Note also the use of \c{*} as the maximum parameter count,
  2074. indicating that there is no upper limit on the number of parameters
  2075. you may supply to the \i\c{multipush} macro.
  2076. It would be convenient, when using this macro, to have a \c{POP}
  2077. equivalent, which \e{didn't} require the arguments to be given in
  2078. reverse order. Ideally, you would write the \c{multipush} macro
  2079. call, then cut-and-paste the line to where the pop needed to be
  2080. done, and change the name of the called macro to \c{multipop}, and
  2081. the macro would take care of popping the registers in the opposite
  2082. order from the one in which they were pushed.
  2083. This can be done by the following definition:
  2084. \c %macro multipop 1-*
  2085. \c
  2086. \c %rep %0
  2087. \c %rotate -1
  2088. \c pop %1
  2089. \c %endrep
  2090. \c
  2091. \c %endmacro
  2092. This macro begins by rotating its arguments one place to the
  2093. \e{right}, so that the original \e{last} argument appears as \c{%1}.
  2094. This is then popped, and the arguments are rotated right again, so
  2095. the second-to-last argument becomes \c{%1}. Thus the arguments are
  2096. iterated through in reverse order.
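Used together, the two macros might be invoked as follows (the
register list is only an example):
\c         multipush eax,ebx,ecx  ; push eax / push ebx / push ecx
\c         ; ... code that clobbers the registers ...
\c         multipop  eax,ebx,ecx  ; pop ecx / pop ebx / pop eax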
  2097. \S{concat} \i{Concatenating Macro Parameters}
  2098. NASM can concatenate macro parameters and macro indirection constructs
  2099. on to other text surrounding them. This allows you to declare a family
  2100. of symbols, for example, in a macro definition. If, for example, you
  2101. wanted to generate a table of key codes along with offsets into the
  2102. table, you could code something like
  2103. \c %macro keytab_entry 2
  2104. \c
  2105. \c keypos%1 equ $-keytab
  2106. \c db %2
  2107. \c
  2108. \c %endmacro
  2109. \c
  2110. \c keytab:
  2111. \c keytab_entry F1,128+1
  2112. \c keytab_entry F2,128+2
  2113. \c keytab_entry Return,13
  2114. which would expand to
  2115. \c keytab:
  2116. \c keyposF1 equ $-keytab
  2117. \c db 128+1
  2118. \c keyposF2 equ $-keytab
  2119. \c db 128+2
  2120. \c keyposReturn equ $-keytab
  2121. \c db 13
  2122. You can just as easily concatenate text on to the other end of a
  2123. macro parameter, by writing \c{%1foo}.
  2124. If you need to append a \e{digit} to a macro parameter, for example
  2125. defining labels \c{foo1} and \c{foo2} when passed the parameter
  2126. \c{foo}, you can't code \c{%11} because that would be taken as the
  2127. eleventh macro parameter. Instead, you must code
  2128. \I{braces, after % sign}\c{%\{1\}1}, which will separate the first
  2129. \c{1} (giving the number of the macro parameter) from the second
  2130. (literal text to be concatenated to the parameter).
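For instance, a hypothetical macro that defines such a pair of labels
could be sketched as:
\c %macro twolabels 1
\c
\c %{1}1:  db 0
\c %{1}2:  db 0
\c
\c %endmacro
\c
\c         twolabels foo          ; defines the labels foo1 and foo2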
  2131. This concatenation can also be applied to other preprocessor in-line
  2132. objects, such as macro-local labels (\k{maclocal}) and context-local
  2133. labels (\k{ctxlocal}). In all cases, ambiguities in syntax can be
  2134. resolved by enclosing everything after the \c{%} sign and before the
  2135. literal text in braces: so \c{%\{%foo\}bar} concatenates the text
  2136. \c{bar} to the end of the real name of the macro-local label
  2137. \c{%%foo}. (This is unnecessary, since the form NASM uses for the
  2138. real names of macro-local labels means that the two usages
  2139. \c{%\{%foo\}bar} and \c{%%foobar} would both expand to the same
  2140. thing anyway; nevertheless, the capability is there.)
  2141. The single-line macro indirection construct, \c{%[...]}
  2142. (\k{indmacro}), behaves the same way as macro parameters for the
  2143. purpose of concatenation.
  2144. See also the \c{%+} operator, \k{concat%+}.
  2145. \S{mlmaccc} \i{Condition Codes as Macro Parameters}
  2146. NASM can give special treatment to a macro parameter which contains
  2147. a condition code. For a start, you can refer to the macro parameter
  2148. \c{%1} by means of the alternative syntax \i\c{%+1}, which informs
  2149. NASM that this macro parameter is supposed to contain a condition
  2150. code, and will cause the preprocessor to report an error message if
  2151. the macro is called with a parameter which is \e{not} a valid
  2152. condition code.
  2153. Far more usefully, though, you can refer to the macro parameter by
  2154. means of \i\c{%-1}, which NASM will expand as the \e{inverse}
  2155. condition code. So the \c{retz} macro defined in \k{maclocal} can be
  2156. replaced by a general \i{conditional-return macro} like this:
  2157. \c %macro retc 1
  2158. \c
  2159. \c j%-1 %%skip
  2160. \c ret
  2161. \c %%skip:
  2162. \c
  2163. \c %endmacro
  2164. This macro can now be invoked using calls like \c{retc ne}, which
  2165. will cause the conditional-jump instruction in the macro expansion
  2166. to come out as \c{JE}, or \c{retc po} which will make the jump a
  2167. \c{JPE}.
  2168. The \c{%+1} macro-parameter reference is quite happy to interpret
  2169. the arguments \c{CXZ} and \c{ECXZ} as valid condition codes;
  2170. however, \c{%-1} will report an error if passed either of these,
  2171. because no inverse condition code exists.
  2172. \S{nolist} \i{Disabling Listing Expansion}\I\c{.nolist}
  2173. When NASM is generating a listing file from your program, it will
  2174. generally expand multi-line macros by means of writing the macro
  2175. call and then listing each line of the expansion. This allows you to
  2176. see which instructions in the macro expansion are generating what
  2177. code; however, for some macros this clutters the listing up
  2178. unnecessarily.
  2179. NASM therefore provides the \c{.nolist} qualifier, which you can
  2180. include in a macro definition to inhibit the expansion of the macro
  2181. in the listing file. The \c{.nolist} qualifier comes directly after
  2182. the number of parameters, like this:
  2183. \c %macro foo 1.nolist
  2184. Or like this:
  2185. \c %macro bar 1-5+.nolist a,b,c,d,e,f,g,h
  2186. \S{unmacro} Undefining Multi-Line Macros: \i\c{%unmacro}
  2187. Multi-line macros can be removed with the \c{%unmacro} directive.
  2188. Unlike the \c{%undef} directive, however, \c{%unmacro} takes an
  2189. argument specification, and will only remove \i{exact matches} with
  2190. that argument specification.
  2191. For example:
  2192. \c %macro foo 1-3
  2193. \c ; Do something
  2194. \c %endmacro
  2195. \c %unmacro foo 1-3
  2196. removes the previously defined macro \c{foo}, but
  2197. \c %macro bar 1-3
  2198. \c ; Do something
  2199. \c %endmacro
  2200. \c %unmacro bar 1
  2201. does \e{not} remove the macro \c{bar}, since the argument
  2202. specification does not match exactly.
  2203. \H{condasm} \i{Conditional Assembly}\I\c{%if}
  2204. Similarly to the C preprocessor, NASM allows sections of a source
  2205. file to be assembled only if certain conditions are met. The general
  2206. syntax of this feature looks like this:
  2207. \c %if<condition>
  2208. \c ; some code which only appears if <condition> is met
  2209. \c %elif<condition2>
  2210. \c ; only appears if <condition> is not met but <condition2> is
  2211. \c %else
  2212. \c ; this appears if neither <condition> nor <condition2> was met
  2213. \c %endif
  2214. The inverse forms \i\c{%ifn} and \i\c{%elifn} are also supported.
  2215. The \i\c{%else} clause is optional, as is the \i\c{%elif} clause.
  2216. You can have more than one \c{%elif} clause as well.
  2217. There are a number of variants of the \c{%if} directive. Each has its
  2218. corresponding \c{%elif}, \c{%ifn}, and \c{%elifn} directives; for
  2219. example, the equivalents to the \c{%ifdef} directive are \c{%elifdef},
  2220. \c{%ifndef}, and \c{%elifndef}.
  2221. \S{ifdef} \i\c{%ifdef}: Testing Single-Line Macro Existence\I{testing,
  2222. single-line macro existence}
  2223. Beginning a conditional-assembly block with the line \c{%ifdef
  2224. MACRO} will assemble the subsequent code if, and only if, a
  2225. single-line macro called \c{MACRO} is defined. If not, then the
  2226. \c{%elif} and \c{%else} blocks (if any) will be processed instead.
  2227. For example, when debugging a program, you might want to write code
  2228. such as
  2229. \c ; perform some function
  2230. \c %ifdef DEBUG
  2231. \c writefile 2,"Function performed successfully",13,10
  2232. \c %endif
  2233. \c ; go and do something else
  2234. Then you could use the command-line option \c{-dDEBUG} to create a
  2235. version of the program which produced debugging messages, and remove
  2236. the option to generate the final release version of the program.
  2237. You can test for a macro \e{not} being defined by using
  2238. \i\c{%ifndef} instead of \c{%ifdef}. You can also test for macro
  2239. definitions in \c{%elif} blocks by using \i\c{%elifdef} and
  2240. \i\c{%elifndef}.
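A common idiom built on \c{%ifndef} is to supply a default value that
can be overridden from the command line with \c{-d}. In this sketch,
\c{BUFSIZE} and \c{buffer} are illustrative names:
\c %ifndef BUFSIZE
\c %define BUFSIZE 512            ; used unless -dBUFSIZE=... was given
\c %endif
\c
\c buffer: times BUFSIZE db 0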
  2241. \S{ifmacro} \i\c{%ifmacro}: Testing Multi-Line Macro
  2242. Existence\I{testing, multi-line macro existence}
  2243. The \c{%ifmacro} directive operates in the same way as the \c{%ifdef}
  2244. directive, except that it checks for the existence of a multi-line macro.
  2245. For example, you may be working with a large project and not have control
  2246. over the macros in a library. You may want to create a macro with one
  2247. name if it doesn't already exist, and another name if one with that name
  2248. does exist.
The \c{%ifmacro} condition is considered true if defining a macro with the
given name and number of arguments would cause a definition conflict. For example:
  2251. \c %ifmacro MyMacro 1-3
  2252. \c
  2253. \c %error "MyMacro 1-3" causes a conflict with an existing macro.
  2254. \c
  2255. \c %else
  2256. \c
  2257. \c %macro MyMacro 1-3
  2258. \c
  2259. \c ; insert code to define the macro
  2260. \c
  2261. \c %endmacro
  2262. \c
  2263. \c %endif
This will create the macro "MyMacro 1-3" if no macro already exists which
would conflict with it, and will report an error (via \c{%error}) if there
would be a definition conflict.
You can test for the macro not existing by using \i\c{%ifnmacro} instead
of \c{%ifmacro}. Additional tests can be performed in \c{%elif} blocks by using
\i\c{%elifmacro} and \i\c{%elifnmacro}.
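The negative form gives a compact way to provide a fallback definition only
when no conflicting one exists. This is a sketch under assumed names: the
macro \c{trace} and its body are hypothetical.

\c %ifnmacro trace 1
\c
\c %macro trace 1
\c         ; hypothetical fallback: store the message text as data
\c         db %1, 0
\c %endmacro
\c
\c %endif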
  2270. \S{ifctx} \i\c{%ifctx}: Testing the Context Stack\I{testing, context
  2271. stack}
  2272. The conditional-assembly construct \c{%ifctx} will cause the
  2273. subsequent code to be assembled if and only if the top context on
  2274. the preprocessor's context stack has the same name as one of the arguments.
  2275. As with \c{%ifdef}, the inverse and \c{%elif} forms \i\c{%ifnctx},
  2276. \i\c{%elifctx} and \i\c{%elifnctx} are also supported.
  2277. For more details of the context stack, see \k{ctxstack}. For a
  2278. sample use of \c{%ifctx}, see \k{blockif}.
  2279. \S{if} \i\c{%if}: Testing Arbitrary Numeric Expressions\I{testing,
  2280. arbitrary numeric expressions}
  2281. The conditional-assembly construct \c{%if expr} will cause the
  2282. subsequent code to be assembled if and only if the value of the
  2283. numeric expression \c{expr} is non-zero. An example of the use of
  2284. this feature is in deciding when to break out of a \c{%rep}
  2285. preprocessor loop: see \k{rep} for a detailed example.
  2286. The expression given to \c{%if}, and its counterpart \i\c{%elif}, is
  2287. a critical expression (see \k{crit}).
  2288. \c{%if} extends the normal NASM expression syntax, by providing a
  2289. set of \i{relational operators} which are not normally available in
  2290. expressions. The operators \i\c{=}, \i\c{<}, \i\c{>}, \i\c{<=},
  2291. \i\c{>=} and \i\c{<>} test equality, less-than, greater-than,
  2292. less-or-equal, greater-or-equal and not-equal respectively. The
  2293. C-like forms \i\c{==} and \i\c{!=} are supported as alternative
  2294. forms of \c{=} and \c{<>}. In addition, low-priority logical
  2295. operators \i\c{&&}, \i\c{^^} and \i\c{||} are provided, supplying
  2296. \i{logical AND}, \i{logical XOR} and \i{logical OR}. These work like
  2297. the C logical operators (although C has no logical XOR), in that
  2298. they always return either 0 or 1, and treat any non-zero input as 1
  2299. (so that \c{^^}, for example, returns 1 if exactly one of its inputs
  2300. is zero, and 0 otherwise). The relational operators also return 1
  2301. for true and 0 for false.
  2302. Like other \c{%if} constructs, \c{%if} has a counterpart
  2303. \i\c{%elif}, and negative forms \i\c{%ifn} and \i\c{%elifn}.
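As an illustration of the relational and logical operators, the following
sketch combines several tests. The names \c{rows} and \c{cols} are
assumptions made up for this example:

\c %assign rows 4
\c %assign cols 16
\c
\c %if (rows > 2) && (cols <= 16)
\c         ; this branch is assembled: both comparisons evaluate to 1
\c         db rows*cols
\c %elif (rows = 4) ^^ (cols = 16)
\c         ; would be chosen only if exactly one of the two tests held
\c         db 0
\c %else
\c         db 0xFF
\c %endif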
  2304. \S{ifidn} \i\c{%ifidn} and \i\c{%ifidni}: Testing Exact Text
  2305. Identity\I{testing, exact text identity}
  2306. The construct \c{%ifidn text1,text2} will cause the subsequent code
  2307. to be assembled if and only if \c{text1} and \c{text2}, after
  2308. expanding single-line macros, are identical pieces of text.
  2309. Differences in white space are not counted.
  2310. \c{%ifidni} is similar to \c{%ifidn}, but is \i{case-insensitive}.
  2311. For example, the following macro pushes a register or number on the
  2312. stack, and allows you to treat \c{IP} as a real register:
  2313. \c %macro pushparam 1
  2314. \c
  2315. \c %ifidni %1,ip
  2316. \c call %%label
  2317. \c %%label:
  2318. \c %else
  2319. \c push %1
  2320. \c %endif
  2321. \c
  2322. \c %endmacro
  2323. Like other \c{%if} constructs, \c{%ifidn} has a counterpart
  2324. \i\c{%elifidn}, and negative forms \i\c{%ifnidn} and \i\c{%elifnidn}.
  2325. Similarly, \c{%ifidni} has counterparts \i\c{%elifidni},
  2326. \i\c{%ifnidni} and \i\c{%elifnidni}.
  2327. \S{iftyp} \i\c{%ifid}, \i\c{%ifnum}, \i\c{%ifstr}: Testing Token
  2328. Types\I{testing, token types}
  2329. Some macros will want to perform different tasks depending on
  2330. whether they are passed a number, a string, or an identifier. For
  2331. example, a string output macro might want to be able to cope with
  2332. being passed either a string constant or a pointer to an existing
  2333. string.
  2334. The conditional assembly construct \c{%ifid}, taking one parameter
  2335. (which may be blank), assembles the subsequent code if and only if
  2336. the first token in the parameter exists and is an identifier.
  2337. \c{%ifnum} works similarly, but tests for the token being a numeric
  2338. constant; \c{%ifstr} tests for it being a string.
  2339. For example, the \c{writefile} macro defined in \k{mlmacgre} can be
  2340. extended to take advantage of \c{%ifstr} in the following fashion:
  2341. \c %macro writefile 2-3+
  2342. \c
  2343. \c %ifstr %2
  2344. \c jmp %%endstr
  2345. \c %if %0 = 3
  2346. \c %%str: db %2,%3
  2347. \c %else
  2348. \c %%str: db %2
  2349. \c %endif
  2350. \c %%endstr: mov dx,%%str
  2351. \c mov cx,%%endstr-%%str
  2352. \c %else
  2353. \c mov dx,%2
  2354. \c mov cx,%3
  2355. \c %endif
  2356. \c mov bx,%1
  2357. \c mov ah,0x40
  2358. \c int 0x21
  2359. \c
  2360. \c %endmacro
  2361. Then the \c{writefile} macro can cope with being called in either of
  2362. the following two ways:
  2363. \c writefile [file], strpointer, length
  2364. \c writefile [file], "hello", 13, 10
  2365. In the first, \c{strpointer} is used as the address of an
  2366. already-declared string, and \c{length} is used as its length; in
  2367. the second, a string is given to the macro, which therefore declares
  2368. it itself and works out the address and length for itself.
  2369. Note the use of \c{%if} inside the \c{%ifstr}: this is to detect
  2370. whether the macro was passed two arguments (so the string would be a
  2371. single string constant, and \c{db %2} would be adequate) or more (in
  2372. which case, all but the first two would be lumped together into
  2373. \c{%3}, and \c{db %2,%3} would be required).
  2374. The usual \I\c{%elifid}\I\c{%elifnum}\I\c{%elifstr}\c{%elif}...,
  2375. \I\c{%ifnid}\I\c{%ifnnum}\I\c{%ifnstr}\c{%ifn}..., and
  2376. \I\c{%elifnid}\I\c{%elifnnum}\I\c{%elifnstr}\c{%elifn}... versions
  2377. exist for each of \c{%ifid}, \c{%ifnum} and \c{%ifstr}.
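The token-type tests can be chained with their \c{%elif} forms. The sketch
below uses a hypothetical macro \c{loadax} and a hypothetical variable
\c{counter} to show one possible arrangement:

\c %macro loadax 1
\c
\c %ifnum %1
\c         mov ax,%1        ; numeric constant: load it as an immediate
\c %elifid %1
\c         mov ax,[%1]      ; identifier: treat it as a variable name
\c %else
\c %error "loadax expects a number or an identifier"
\c %endif
\c
\c %endmacro
\c
\c         loadax 42        ; becomes mov ax,42
\c         loadax counter   ; becomes mov ax,[counter]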
  2378. \S{iftoken} \i\c{%iftoken}: Test for a Single Token
Some macros will want to do different things depending on whether they
are passed a single token (e.g. to paste it to something else using \c{%+})
or a multi-token sequence.
  2382. The conditional assembly construct \c{%iftoken} assembles the
  2383. subsequent code if and only if the expanded parameters consist of
  2384. exactly one token, possibly surrounded by whitespace.
  2385. For example:
  2386. \c %iftoken 1
  2387. will assemble the subsequent code, but
  2388. \c %iftoken -1
  2389. will not, since \c{-1} contains two tokens: the unary minus operator
  2390. \c{-}, and the number \c{1}.
The usual \i\c{%eliftoken}, \i\c{%ifntoken}, and \i\c{%elifntoken}
variants are also provided.
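A macro that pastes its argument into a label name can use \c{%iftoken} to
reject multi-token arguments. This is only a sketch; the macro name
\c{deflabel} and the \c{entry_} prefix are assumptions chosen for
illustration:

\c %macro deflabel 1
\c
\c %iftoken %1
\c entry_%1:
\c %else
\c %error "deflabel expects exactly one token"
\c %endif
\c
\c %endmacro
\c
\c         deflabel init    ; defines the label entry_init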
  2393. \S{ifempty} \i\c{%ifempty}: Test for Empty Expansion
  2394. The conditional assembly construct \c{%ifempty} assembles the
  2395. subsequent code if and only if the expanded parameters do not contain
  2396. any tokens at all, whitespace excepted.
The usual \i\c{%elifempty}, \i\c{%ifnempty}, and \i\c{%elifnempty}
variants are also provided.
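For instance, \c{%ifempty} can distinguish a single-line macro that is
defined but expands to nothing from one that carries a value. In this
sketch the macro name \c{SUFFIX} and the string contents are hypothetical:

\c ; SUFFIX is defined, but its expansion contains no tokens
\c %define SUFFIX
\c
\c %ifempty SUFFIX
\c         db "report", 0
\c %else
\c         db "report", SUFFIX, 0
\c %endif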
  2399. \S{ifenv} \i\c{%ifenv}: Test If Environment Variable Exists
  2400. The conditional assembly construct \c{%ifenv} assembles the
  2401. subsequent code if and only if the environment variable referenced by
  2402. the \c{%!}\e{variable} directive exists.
The usual \i\c{%elifenv}, \i\c{%ifnenv}, and \i\c{%elifnenv}
variants are also provided.
  2405. Just as for \c{%!}\e{variable} the argument should be written as a
  2406. string if it contains characters that would not be legal in an
  2407. identifier. See \k{getenv}.
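One way to combine \c{%ifenv} with \c{%!}\e{variable} is to embed an
environment variable when it is set and fall back to a fixed string
otherwise. The variable name \c{BUILD_TAG}, the macro \c{build_tag} and
the label \c{tag} are assumptions made for this sketch:

\c %ifenv BUILD_TAG
\c %defstr build_tag %!BUILD_TAG
\c %else
\c %define build_tag "unknown"
\c %endif
\c
\c tag:    db build_tag, 0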
  2408. \H{rep} \i{Preprocessor Loops}\I{repeating code}: \i\c{%rep}
  2409. NASM's \c{TIMES} prefix, though useful, cannot be used to invoke a
  2410. multi-line macro multiple times, because it is processed by NASM
  2411. after macros have already been expanded. Therefore NASM provides
  2412. another form of loop, this time at the preprocessor level: \c{%rep}.
  2413. The directives \c{%rep} and \i\c{%endrep} (\c{%rep} takes a numeric
  2414. argument, which can be an expression; \c{%endrep} takes no
  2415. arguments) can be used to enclose a chunk of code, which is then
  2416. replicated as many times as specified by the preprocessor:
  2417. \c %assign i 0
  2418. \c %rep 64
  2419. \c inc word [table+2*i]
  2420. \c %assign i i+1
  2421. \c %endrep
  2422. This will generate a sequence of 64 \c{INC} instructions,
  2423. incrementing every word of memory from \c{[table]} to
  2424. \c{[table+126]}.
  2425. For more complex termination conditions, or to break out of a repeat
  2426. loop part way along, you can use the \i\c{%exitrep} directive to
  2427. terminate the loop, like this:
  2428. \c fibonacci:
  2429. \c %assign i 0
  2430. \c %assign j 1
  2431. \c %rep 100
  2432. \c %if j > 65535
  2433. \c %exitrep
  2434. \c %endif
  2435. \c dw j
  2436. \c %assign k j+i
  2437. \c %assign i j
  2438. \c %assign j k
  2439. \c %endrep
  2440. \c
  2441. \c fib_number equ ($-fibonacci)/2
  2442. This produces a list of all the Fibonacci numbers that will fit in
  2443. 16 bits. Note that a maximum repeat count must still be given to
  2444. \c{%rep}. This is to prevent the possibility of NASM getting into an
  2445. infinite loop in the preprocessor, which (on multitasking or
  2446. multi-user systems) would typically cause all the system memory to
  2447. be gradually used up and other applications to start crashing.
Note that the maximum repeat count is limited to a 62-bit number, though it
is unlikely that you will ever need anything bigger.
  2450. \H{files} Source Files and Dependencies
  2451. These commands allow you to split your sources into multiple files.
  2452. \S{include} \i\c{%include}: \i{Including Other Files}
  2453. Using, once again, a very similar syntax to the C preprocessor,
  2454. NASM's preprocessor lets you include other source files into your
  2455. code. This is done by the use of the \i\c{%include} directive:
  2456. \c %include "macros.mac"
  2457. will include the contents of the file \c{macros.mac} into the source
  2458. file containing the \c{%include} directive.
  2459. Include files are \I{searching for include files}searched for in the
  2460. current directory (the directory you're in when you run NASM, as
  2461. opposed to the location of the NASM executable or the location of
  2462. the source file), plus any directories specified on the NASM command
  2463. line using the \c{-i} option.
  2464. The standard C idiom for preventing a file being included more than
  2465. once is just as applicable in NASM: if the file \c{macros.mac} has
  2466. the form
  2467. \c %ifndef MACROS_MAC
  2468. \c %define MACROS_MAC
  2469. \c ; now define some macros
  2470. \c %endif
  2471. then including the file more than once will not cause errors,
  2472. because the second time the file is included nothing will happen
  2473. because the macro \c{MACROS_MAC} will already be defined.
  2474. You can force a file to be included even if there is no \c{%include}
  2475. directive that explicitly includes it, by using the \i\c{-p} option
  2476. on the NASM command line (see \k{opt-p}).
  2477. \S{pathsearch} \i\c{%pathsearch}: Search the Include Path
The \c{%pathsearch} directive takes a single-line macro name and a
filename, and declares or redefines the specified single-line macro to
be the include-path-resolved version of the filename, if the file
exists (otherwise, it is passed unchanged).
  2482. For example,
  2483. \c %pathsearch MyFoo "foo.bin"
  2484. ... with \c{-Ibins/} in the include path may end up defining the macro
  2485. \c{MyFoo} to be \c{"bins/foo.bin"}.
  2486. \S{depend} \i\c{%depend}: Add Dependent Files
The \c{%depend} directive takes a filename and adds it to the list of
files emitted during dependency generation, when the \c{-M} option
or one of its relatives (see \k{opt-M}) is used. It produces no output.
  2490. This is generally used in conjunction with \c{%pathsearch}. For
  2491. example, a simplified version of the standard macro wrapper for the
  2492. \c{INCBIN} directive looks like:
  2493. \c %imacro incbin 1-2+ 0
  2494. \c %pathsearch dep %1
  2495. \c %depend dep
  2496. \c incbin dep,%2
  2497. \c %endmacro
  2498. This first resolves the location of the file into the macro \c{dep},
  2499. then adds it to the dependency lists, and finally issues the
  2500. assembler-level \c{INCBIN} directive.
  2501. \S{use} \i\c{%use}: Include Standard Macro Package
  2502. The \c{%use} directive is similar to \c{%include}, but rather than
  2503. including the contents of a file, it includes a named standard macro
  2504. package. The standard macro packages are part of NASM, and are
  2505. described in \k{macropkg}.
  2506. Unlike the \c{%include} directive, package names for the \c{%use}
  2507. directive do not require quotes, but quotes are permitted. In NASM
  2508. 2.04 and 2.05 the unquoted form would be macro-expanded; this is no
  2509. longer true. Thus, the following lines are equivalent:
  2510. \c %use altreg
  2511. \c %use 'altreg'
  2512. Standard macro packages are protected from multiple inclusion. When a
  2513. standard macro package is used, a testable single-line macro of the
  2514. form \c{__USE_}\e{package}\c{__} is also defined, see \k{use_def}.
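The testable macro lets later code check whether a package has been loaded.
The following sketch assumes the \c{altreg} package and its alternate
register names; the fallback branch is hypothetical:

\c %use altreg
\c
\c %ifdef __USE_ALTREG__
\c         mov r0l,r3h      ; alternate names for al and bh
\c %else
\c         mov al,bh
\c %endif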
  2515. \H{ctxstack} The \i{Context Stack}
  2516. Having labels that are local to a macro definition is sometimes not
  2517. quite powerful enough: sometimes you want to be able to share labels
  2518. between several macro calls. An example might be a \c{REPEAT} ...
  2519. \c{UNTIL} loop, in which the expansion of the \c{REPEAT} macro
  2520. would need to be able to refer to a label which the \c{UNTIL} macro
  2521. had defined. However, for such a macro you would also want to be
  2522. able to nest these loops.
  2523. NASM provides this level of power by means of a \e{context stack}.
  2524. The preprocessor maintains a stack of \e{contexts}, each of which is
  2525. characterized by a name. You add a new context to the stack using
  2526. the \i\c{%push} directive, and remove one using \i\c{%pop}. You can
  2527. define labels that are local to a particular context on the stack.
  2528. \S{pushpop} \i\c{%push} and \i\c{%pop}: \I{creating
  2529. contexts}\I{removing contexts}Creating and Removing Contexts
  2530. The \c{%push} directive is used to create a new context and place it
  2531. on the top of the context stack. \c{%push} takes an optional argument,
  2532. which is the name of the context. For example:
  2533. \c %push foobar
  2534. This pushes a new context called \c{foobar} on the stack. You can have
  2535. several contexts on the stack with the same name: they can still be
  2536. distinguished. If no name is given, the context is unnamed (this is
  2537. normally used when both the \c{%push} and the \c{%pop} are inside a
  2538. single macro definition.)
  2539. The directive \c{%pop}, taking one optional argument, removes the top
  2540. context from the context stack and destroys it, along with any
  2541. labels associated with it. If an argument is given, it must match the
  2542. name of the current context, otherwise it will issue an error.
  2543. \S{ctxlocal} \i{Context-Local Labels}
  2544. Just as the usage \c{%%foo} defines a label which is local to the
  2545. particular macro call in which it is used, the usage \I{%$}\c{%$foo}
  2546. is used to define a label which is local to the context on the top
  2547. of the context stack. So the \c{REPEAT} and \c{UNTIL} example given
  2548. above could be implemented by means of:
  2549. \c %macro repeat 0
  2550. \c
  2551. \c %push repeat
  2552. \c %$begin:
  2553. \c
  2554. \c %endmacro
  2555. \c
  2556. \c %macro until 1
  2557. \c
  2558. \c j%-1 %$begin
  2559. \c %pop
  2560. \c
  2561. \c %endmacro
  2562. and invoked by means of, for example,
\c mov di,string
\c repeat
\c add di,3
  2566. \c scasb
  2567. \c until e
  2568. which would scan every fourth byte of a string in search of the byte
  2569. in \c{AL}.
  2570. If you need to define, or access, labels local to the context
  2571. \e{below} the top one on the stack, you can use \I{%$$}\c{%$$foo}, or
  2572. \c{%$$$foo} for the context below that, and so on.
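As a sketch of how the two forms differ, the nested loop below defines a
label called \c{%$top} in each of two contexts and reaches the outer one
through \c{%$$top}. The context names \c{outer} and \c{inner} and the loop
counts are illustrative only:

\c         mov dx,3
\c %push outer
\c %$top:                   ; label local to the context outer
\c         mov cx,4
\c %push inner
\c %$top:                   ; a separate label, local to inner
\c         ; loop body goes here
\c         dec cx
\c         jnz %$top        ; the label in inner
\c         dec dx
\c         jnz %$$top       ; the label in outer
\c %pop
\c %pop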
  2573. \S{ctxdefine} \i{Context-Local Single-Line Macros}
  2574. NASM also allows you to define single-line macros which are local to
  2575. a particular context, in just the same way:
  2576. \c %define %$localmac 3
  2577. will define the single-line macro \c{%$localmac} to be local to the
  2578. top context on the stack. Of course, after a subsequent \c{%push},
  2579. it can then still be accessed by the name \c{%$$localmac}.
  2580. \S{ctxfallthrough} \i{Context Fall-Through Lookup} \e{(deprecated)}
  2581. Context fall-through lookup (automatic searching of outer contexts)
  2582. is a feature that was added in NASM version 0.98.03. Unfortunately,
  2583. this feature is unintuitive and can result in buggy code that would
  2584. have otherwise been prevented by NASM's error reporting. As a result,
  2585. this feature has been \e{deprecated}. NASM version 2.09 will issue a
  2586. warning when usage of this \e{deprecated} feature is detected. Starting
  2587. with NASM version 2.10, usage of this \e{deprecated} feature will simply
  2588. result in an \e{expression syntax error}.
  2589. An example usage of this \e{deprecated} feature follows:
  2590. \c %macro ctxthru 0
  2591. \c %push ctx1
  2592. \c %assign %$external 1
  2593. \c %push ctx2
  2594. \c %assign %$internal 1
  2595. \c mov eax, %$external
  2596. \c mov eax, %$internal
  2597. \c %pop
  2598. \c %pop
  2599. \c %endmacro
  2600. As demonstrated, \c{%$external} is being defined in the \c{ctx1}
  2601. context and referenced within the \c{ctx2} context. With context
  2602. fall-through lookup, referencing an undefined context-local macro
like this implicitly searches through all outer contexts until a match
is found or every context has been searched. As a result, \c{%$external}
  2605. referenced within the \c{ctx2} context would implicitly use \c{%$external}
  2606. as defined in \c{ctx1}. Most people would expect NASM to issue an error in
  2607. this situation because \c{%$external} was never defined within \c{ctx2} and also
  2608. isn't qualified with the proper context depth, \c{%$$external}.
  2609. Here is a revision of the above example with proper context depth:
  2610. \c %macro ctxthru 0
  2611. \c %push ctx1
  2612. \c %assign %$external 1
  2613. \c %push ctx2
  2614. \c %assign %$internal 1
  2615. \c mov eax, %$$external
  2616. \c mov eax, %$internal
  2617. \c %pop
  2618. \c %pop
  2619. \c %endmacro
  2620. As demonstrated, \c{%$external} is still being defined in the \c{ctx1}
  2621. context and referenced within the \c{ctx2} context. However, the
  2622. reference to \c{%$external} within \c{ctx2} has been fully qualified with
  2623. the proper context depth, \c{%$$external}, and thus is no longer ambiguous,
  2624. unintuitive or erroneous.
  2625. \S{ctxrepl} \i\c{%repl}: \I{renaming contexts}Renaming a Context
  2626. If you need to change the name of the top context on the stack (in
  2627. order, for example, to have it respond differently to \c{%ifctx}),
  2628. you can execute a \c{%pop} followed by a \c{%push}; but this will
  2629. have the side effect of destroying all context-local labels and
  2630. macros associated with the context that was just popped.
  2631. NASM provides the directive \c{%repl}, which \e{replaces} a context
  2632. with a different name, without touching the associated macros and
  2633. labels. So you could replace the destructive code
  2634. \c %pop
  2635. \c %push newname
  2636. with the non-destructive version \c{%repl newname}.
  2637. \S{blockif} Example Use of the \i{Context Stack}: \i{Block IFs}
  2638. This example makes use of almost all the context-stack features,
  2639. including the conditional-assembly construct \i\c{%ifctx}, to
  2640. implement a block IF statement as a set of macros.
  2641. \c %macro if 1
  2642. \c
  2643. \c %push if
  2644. \c j%-1 %$ifnot
  2645. \c
  2646. \c %endmacro
  2647. \c
  2648. \c %macro else 0
  2649. \c
  2650. \c %ifctx if
  2651. \c %repl else
  2652. \c jmp %$ifend
  2653. \c %$ifnot:
  2654. \c %else
  2655. \c %error "expected `if' before `else'"
  2656. \c %endif
  2657. \c
  2658. \c %endmacro
  2659. \c
  2660. \c %macro endif 0
  2661. \c
  2662. \c %ifctx if
  2663. \c %$ifnot:
  2664. \c %pop
  2665. \c %elifctx else
  2666. \c %$ifend:
  2667. \c %pop
  2668. \c %else
  2669. \c %error "expected `if' or `else' before `endif'"
  2670. \c %endif
  2671. \c
  2672. \c %endmacro
  2673. This code is more robust than the \c{REPEAT} and \c{UNTIL} macros
  2674. given in \k{ctxlocal}, because it uses conditional assembly to check
  2675. that the macros are issued in the right order (for example, not
  2676. calling \c{endif} before \c{if}) and issues a \c{%error} if they're
  2677. not.
  2678. In addition, the \c{endif} macro has to be able to cope with the two
  2679. distinct cases of either directly following an \c{if}, or following
  2680. an \c{else}. It achieves this, again, by using conditional assembly
  2681. to do different things depending on whether the context on top of
  2682. the stack is \c{if} or \c{else}.
  2683. The \c{else} macro has to preserve the context on the stack, in
  2684. order to have the \c{%$ifnot} referred to by the \c{if} macro be the
  2685. same as the one defined by the \c{endif} macro, but has to change
  2686. the context's name so that \c{endif} will know there was an
  2687. intervening \c{else}. It does this by the use of \c{%repl}.
  2688. A sample usage of these macros might look like:
  2689. \c cmp ax,bx
  2690. \c
  2691. \c if ae
  2692. \c cmp bx,cx
  2693. \c
  2694. \c if ae
  2695. \c mov ax,cx
  2696. \c else
  2697. \c mov ax,bx
  2698. \c endif
  2699. \c
  2700. \c else
  2701. \c cmp ax,cx
  2702. \c
  2703. \c if ae
  2704. \c mov ax,cx
  2705. \c endif
  2706. \c
  2707. \c endif
  2708. The block-\c{IF} macros handle nesting quite happily, by means of
  2709. pushing another context, describing the inner \c{if}, on top of the
  2710. one describing the outer \c{if}; thus \c{else} and \c{endif} always
  2711. refer to the last unmatched \c{if} or \c{else}.
  2712. \H{stackrel} \i{Stack Relative Preprocessor Directives}
  2713. The following preprocessor directives provide a way to use
  2714. labels to refer to local variables allocated on the stack.
  2715. \b\c{%arg} (see \k{arg})
  2716. \b\c{%stacksize} (see \k{stacksize})
  2717. \b\c{%local} (see \k{local})
  2718. \S{arg} \i\c{%arg} Directive
  2719. The \c{%arg} directive is used to simplify the handling of
  2720. parameters passed on the stack. Stack based parameter passing
  2721. is used by many high level languages, including C, C++ and Pascal.
  2722. While NASM has macros which attempt to duplicate this
  2723. functionality (see \k{16cmacro}), the syntax is not particularly
  2724. convenient to use and is not TASM compatible. Here is an example
  2725. which shows the use of \c{%arg} without any external macros:
  2726. \c some_function:
  2727. \c
  2728. \c %push mycontext ; save the current context
  2729. \c %stacksize large ; tell NASM to use bp
  2730. \c %arg i:word, j_ptr:word
  2731. \c
  2732. \c mov ax,[i]
  2733. \c mov bx,[j_ptr]
  2734. \c add ax,[bx]
  2735. \c ret
  2736. \c
  2737. \c %pop ; restore original context
This is similar to the procedure defined in \k{16cmacro} and adds
the value in \c{i} to the value pointed to by \c{j_ptr} and returns the
sum in the \c{ax} register. See \k{pushpop} for an explanation of
\c{%push} and \c{%pop} and the use of context stacks.
  2742. \S{stacksize} \i\c{%stacksize} Directive
  2743. The \c{%stacksize} directive is used in conjunction with the
  2744. \c{%arg} (see \k{arg}) and the \c{%local} (see \k{local}) directives.
  2745. It tells NASM the default size to use for subsequent \c{%arg} and
  2746. \c{%local} directives. The \c{%stacksize} directive takes one
  2747. required argument which is one of \c{flat}, \c{flat64}, \c{large} or \c{small}.
  2748. \c %stacksize flat
  2749. This form causes NASM to use stack-based parameter addressing
  2750. relative to \c{ebp} and it assumes that a near form of call was used
  2751. to get to this label (i.e. that \c{eip} is on the stack).
  2752. \c %stacksize flat64
  2753. This form causes NASM to use stack-based parameter addressing
  2754. relative to \c{rbp} and it assumes that a near form of call was used
  2755. to get to this label (i.e. that \c{rip} is on the stack).
  2756. \c %stacksize large
  2757. This form uses \c{bp} to do stack-based parameter addressing and
  2758. assumes that a far form of call was used to get to this address
  2759. (i.e. that \c{ip} and \c{cs} are on the stack).
  2760. \c %stacksize small
  2761. This form also uses \c{bp} to address stack parameters, but it is
  2762. different from \c{large} because it also assumes that the old value
  2763. of bp is pushed onto the stack (i.e. it expects an \c{ENTER}
  2764. instruction). In other words, it expects that \c{bp}, \c{ip} and
  2765. \c{cs} are on the top of the stack, underneath any local space which
  2766. may have been allocated by \c{ENTER}. This form is probably most
  2767. useful when used in combination with the \c{%local} directive
  2768. (see \k{local}).
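To make the \c{flat} form concrete, here is a sketch of a 32-bit routine
that adds two stack parameters. It assumes that \c{flat} places the first
argument just above the saved \c{ebp} and the return address, so the
routine sets up \c{ebp} itself. The label \c{add_two}, the context name
and the parameter names are hypothetical:

\c add_two:
\c
\c %push add_two_ctx
\c %stacksize flat
\c %arg a:dword, b:dword
\c
\c         push ebp
\c         mov ebp,esp      ; parameters are addressed relative to ebp
\c         mov eax,[a]
\c         add eax,[b]
\c         pop ebp
\c         ret
\c
\c %pop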
  2769. \S{local} \i\c{%local} Directive
  2770. The \c{%local} directive is used to simplify the use of local
  2771. temporary stack variables allocated in a stack frame. Automatic
  2772. local variables in C are an example of this kind of variable. The
\c{%local} directive is most useful when used with the \c{%stacksize}
directive (see \k{stacksize}) and is also compatible with the \c{%arg} directive
(see \k{arg}). It allows simplified reference to variables on the
  2776. stack which have been allocated typically by using the \c{ENTER}
  2777. instruction.
  2778. \# (see \k{insENTER} for a description of that instruction).
  2779. An example of its use is the following:
  2780. \c silly_swap:
  2781. \c
  2782. \c %push mycontext ; save the current context
  2783. \c %stacksize small ; tell NASM to use bp
  2784. \c %assign %$localsize 0 ; see text for explanation
  2785. \c %local old_ax:word, old_dx:word
  2786. \c
  2787. \c enter %$localsize,0 ; see text for explanation
  2788. \c mov [old_ax],ax ; swap ax & bx
  2789. \c mov [old_dx],dx ; and swap dx & cx
  2790. \c mov ax,bx
  2791. \c mov dx,cx
  2792. \c mov bx,[old_ax]
  2793. \c mov cx,[old_dx]
  2794. \c leave ; restore old bp
  2795. \c ret ;
  2796. \c
  2797. \c %pop ; restore original context
  2798. The \c{%$localsize} variable is used internally by the
  2799. \c{%local} directive and \e{must} be defined within the
  2800. current context before the \c{%local} directive may be used.
  2801. Failure to do so will result in one expression syntax error for
  2802. each \c{%local} variable declared. It then may be used in
  2803. the construction of an appropriately sized ENTER instruction
  2804. as shown in the example.
  2805. \H{pperror} Reporting \i{User-Defined Errors}: \i\c{%error}, \i\c{%warning}, \i\c{%fatal}
  2806. The preprocessor directive \c{%error} will cause NASM to report an
  2807. error if it occurs in assembled code. So if other users are going to
  2808. try to assemble your source files, you can ensure that they define the
  2809. right macros by means of code like this:
  2810. \c %ifdef F1
  2811. \c ; do some setup
  2812. \c %elifdef F2
  2813. \c ; do some different setup
  2814. \c %else
  2815. \c %error "Neither F1 nor F2 was defined."
  2816. \c %endif
  2817. Then any user who fails to understand the way your code is supposed
  2818. to be assembled will be quickly warned of their mistake, rather than
  2819. having to wait until the program crashes on being run and then not
  2820. knowing what went wrong.
  2821. Similarly, \c{%warning} issues a warning, but allows assembly to continue:
  2822. \c %ifdef F1
  2823. \c ; do some setup
  2824. \c %elifdef F2
  2825. \c ; do some different setup
  2826. \c %else
  2827. \c %warning "Neither F1 nor F2 was defined, assuming F1."
  2828. \c %define F1
  2829. \c %endif
  2830. \c{%error} and \c{%warning} are issued only on the final assembly
  2831. pass. This makes them safe to use in conjunction with tests that
  2832. depend on symbol values.
  2833. \c{%fatal} terminates assembly immediately, regardless of pass. This
  2834. is useful when there is no point in continuing the assembly further,
  2835. and doing so is likely just going to cause a spew of confusing error
  2836. messages.
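For example, a source file that cannot be assembled sensibly without a
particular definition might stop at once. The macro name \c{TARGET} in
this sketch is an assumption:

\c %ifndef TARGET
\c %fatal "TARGET is not defined; pass it with -dTARGET=... on the command line"
\c %endif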
  2837. It is optional for the message string after \c{%error}, \c{%warning}
  2838. or \c{%fatal} to be quoted. If it is \e{not}, then single-line macros
  2839. are expanded in it, which can be used to display more information to
  2840. the user. For example:
  2841. \c %if foo > 64
  2842. \c %assign foo_over foo-64
  2843. \c %error foo is foo_over bytes too large
  2844. \c %endif
  2845. \H{otherpreproc} \i{Other Preprocessor Directives}
  2846. \S{line} \i\c{%line} Directive
  2847. The \c{%line} directive is used to notify NASM that the input line
  2848. corresponds to a specific line number in another file. Typically
  2849. this other file would be an original source file, with the current
  2850. NASM input being the output of a pre-processor. The \c{%line}
  2851. directive allows NASM to output messages which indicate the line
  2852. number of the original source file, instead of the file that is being
  2853. read by NASM.
  2854. This preprocessor directive is not generally used directly by
  2855. programmers, but may be of interest to preprocessor authors. The
  2856. usage of the \c{%line} preprocessor directive is as follows:
  2857. \c %line nnn[+mmm] [filename]
  2858. In this directive, \c{nnn} identifies the line of the original source
  2859. file which this line corresponds to. \c{mmm} is an optional parameter
  2860. which specifies a line increment value; each line of the input file
  2861. read in is considered to correspond to \c{mmm} lines of the original
  2862. source file. Finally, \c{filename} is an optional parameter which
  2863. specifies the file name of the original source file.
  2864. After reading a \c{%line} preprocessor directive, NASM will report
  2865. all file name and line numbers relative to the values specified
  2866. therein.
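A tool that generates NASM input might therefore emit something like the
following sketch, in which the file name \c{parser.src} and the line
number are hypothetical:

\c %line 120+1 parser.src
\c         ; diagnostics for the lines that follow are reported
\c         ; against parser.src rather than the generated file
\c         mov ax,bx
\c         mov cx,dx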
  2867. If the command line option \i\c{--no-line} is given, all \c{%line}
  2868. directives are ignored. This may be useful for debugging preprocessed
  2869. code. See \k{opt-no-line}.
  2870. \S{getenv} \i\c{%!}\e{variable}: Read an Environment Variable.
  2871. The \c{%!}\e{variable} directive makes it possible to read the value of an
  2872. environment variable at assembly time. This could, for example, be used
  2873. to store the contents of an environment variable into a string, which
  2874. could be used at some other point in your code.
  2875. For example, suppose that you have an environment variable \c{FOO},
  2876. and you want the contents of \c{FOO} to be embedded in your program as
  2877. a quoted string. You could do that as follows:
  2878. \c %defstr FOO %!FOO
  2879. See \k{defstr} for notes on the \c{%defstr} directive.
  2880. If the name of the environment variable contains non-identifier
  2881. characters, you can use string quotes to surround the name of the
  2882. variable, for example:
  2883. \c %defstr C_colon %!'C:'
  2884. \H{stdmac} \i{Standard Macros}
  2885. NASM defines a set of standard macros, which are already defined
  2886. when it starts to process any source file. If you really need a
  2887. program to be assembled with no pre-defined macros, you can use the
  2888. \i\c{%clear} directive to empty the preprocessor of everything but
  2889. context-local preprocessor variables and single-line macros.
  2890. Most \i{user-level assembler directives} (see \k{directive}) are
  2891. implemented as macros which invoke primitive directives; these are
  2892. described in \k{directive}. The rest of the standard macro set is
  2893. described here.
  2894. \S{stdmacver} \i{NASM Version} Macros
  2895. The single-line macros \i\c{__NASM_MAJOR__}, \i\c{__NASM_MINOR__},
  2896. \i\c{__NASM_SUBMINOR__} and \i\c{__NASM_PATCHLEVEL__} expand to the
  2897. major, minor, subminor and patch level parts of the \i{version
  2898. number of NASM} being used. So, under NASM 0.98.32p1 for
  2899. example, \c{__NASM_MAJOR__} would be defined as 0, \c{__NASM_MINOR__}
  2900. would be defined as 98, \c{__NASM_SUBMINOR__} would be defined as 32,
  2901. and \c{__NASM_PATCHLEVEL__} would be defined as 1.
  2902. Additionally, the macro \i\c{__NASM_SNAPSHOT__} is defined for
  2903. automatically generated snapshot releases \e{only}.
  2904. \S{stdmacverid} \i\c{__NASM_VERSION_ID__}: \i{NASM Version ID}
  2905. The single-line macro \c{__NASM_VERSION_ID__} expands to a dword integer
  2906. representing the full version number of the version of NASM being used.
  2907. The value is equivalent to \c{__NASM_MAJOR__}, \c{__NASM_MINOR__},
  2908. \c{__NASM_SUBMINOR__} and \c{__NASM_PATCHLEVEL__} concatenated to
  2909. produce a single doubleword. Hence, for 0.98.32p1, the returned number
  2910. would be equivalent to:
  2911. \c dd 0x00622001
  2912. or
  2913. \c db 1,32,98,0
  2914. Note that the above lines generate exactly the same code; the second
  2915. line is used just to give an indication of the order in which the separate
  2916. values will be present in memory.
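Because the components are packed most-significant first, version IDs
can be compared as plain integers. The following is a minimal sketch of
a version guard; the required version (2.10, i.e. major 0x02, minor
0x0A, subminor and patch level zero) and the message text are
assumptions chosen for illustration.

\c %if __NASM_VERSION_ID__ < 0x020A0000
\c   %error "this source requires NASM 2.10 or later"
\c %endif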
  2917. \S{stdmacverstr} \i\c{__NASM_VER__}: \i{NASM Version string}
  2918. The single-line macro \c{__NASM_VER__} expands to a string which defines
  2919. the version number of NASM being used. So, under NASM 0.98.32 for example,
  2920. \c db __NASM_VER__
  2921. would expand to
  2922. \c db "0.98.32"
  2923. \S{fileline} \i\c{__FILE__} and \i\c{__LINE__}: File Name and Line Number
  2924. Like the C preprocessor, NASM allows the user to find out the file
  2925. name and line number containing the current instruction. The macro
  2926. \c{__FILE__} expands to a string constant giving the name of the
  2927. current input file (which may change through the course of assembly
  2928. if \c{%include} directives are used), and \c{__LINE__} expands to a
  2929. numeric constant giving the current line number in the input file.
  2930. These macros could be used, for example, to communicate debugging
  2931. information to a macro, since invoking \c{__LINE__} inside a macro
  2932. definition (either single-line or multi-line) will return the line
  2933. number of the macro \e{call}, rather than \e{definition}. So to
  2934. determine where in a piece of code a crash is occurring, for
  2935. example, one could write a routine \c{stillhere}, which is passed a
  2936. line number in \c{EAX} and outputs something like `line 155: still
  2937. here'. You could then write a macro
  2938. \c %macro notdeadyet 0
  2939. \c
  2940. \c push eax
  2941. \c mov eax,__LINE__
  2942. \c call stillhere
  2943. \c pop eax
  2944. \c
  2945. \c %endmacro
  2946. and then pepper your code with calls to \c{notdeadyet} until you
  2947. find the crash point.
  2948. \S{bitsm} \i\c{__BITS__}: Current BITS Mode
  2949. The \c{__BITS__} standard macro is updated every time that the BITS mode is
  2950. set using the \c{BITS XX} or \c{[BITS XX]} directive, where XX is a valid mode
  2951. number of 16, 32 or 64. \c{__BITS__} receives the specified mode number and
  2952. makes it globally available. This can be very useful for those who utilize
  2953. mode-dependent macros.
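A minimal sketch of such a mode-dependent definition follows. The names
\c{PTR_SIZE}, \c{DPTR} and \c{null_ptr} are hypothetical. Note that the
\c{%if} is evaluated where it appears, so it has to come after the
\c{BITS} directive it is meant to observe.

\c %if __BITS__ == 64
\c   %define PTR_SIZE 8
\c   %define DPTR     dq
\c %elif __BITS__ == 32
\c   %define PTR_SIZE 4
\c   %define DPTR     dd
\c %else
\c   %define PTR_SIZE 2
\c   %define DPTR     dw
\c %endif
\c
\c null_ptr:       DPTR    0               ; one pointer-sized zero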
  2954. \S{ofmtm} \i\c{__OUTPUT_FORMAT__}: Current Output Format
  2955. The \c{__OUTPUT_FORMAT__} standard macro holds the current output
  2956. format name, as given by the \c{-f} option or NASM's default. Type
  2957. \c{nasm -hf} for a list.
  2958. \c %ifidn __OUTPUT_FORMAT__, win32
  2959. \c %define NEWLINE 13, 10
  2960. \c %elifidn __OUTPUT_FORMAT__, elf32
  2961. \c %define NEWLINE 10
  2962. \c %endif
  2963. \S{dfmtm} \i\c{__DEBUG_FORMAT__}: Current Debug Format
  2964. If debugging information generation is enabled, the
  2965. \c{__DEBUG_FORMAT__} standard macro holds the current debug format
  2966. name as specified by the \c{-F} or \c{-g} option or the output format
  2967. default. Type \c{nasm -f} \e{output} \c{-y} for a list.
  2968. \c{__DEBUG_FORMAT__} is not defined if debugging is not enabled, or if
  2969. the debug format specified is \c{null}.
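Since the macro is simply undefined when no debug information is being
generated, it can double as a build-type test. In this sketch the name
\c{DEBUG_BUILD} is an assumption, not something NASM defines:

\c %ifdef __DEBUG_FORMAT__
\c   %define DEBUG_BUILD 1
\c %else
\c   %define DEBUG_BUILD 0
\c %endif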
  2970. \S{datetime} Assembly Date and Time Macros
  2971. NASM provides a variety of macros that represent the timestamp of the
  2972. assembly session.
  2973. \b The \i\c{__DATE__} and \i\c{__TIME__} macros give the assembly date and
  2974. time as strings, in ISO 8601 format (\c{"YYYY-MM-DD"} and \c{"HH:MM:SS"},
  2975. respectively.)
  2976. \b The \i\c{__DATE_NUM__} and \i\c{__TIME_NUM__} macros give the assembly
  2977. date and time in numeric form; in the format \c{YYYYMMDD} and
  2978. \c{HHMMSS} respectively.
  2979. \b The \i\c{__UTC_DATE__} and \i\c{__UTC_TIME__} macros give the assembly
  2980. date and time in universal time (UTC) as strings, in ISO 8601 format
  2981. (\c{"YYYY-MM-DD"} and \c{"HH:MM:SS"}, respectively.) If the host
  2982. platform doesn't provide UTC time, these macros are undefined.
  2983. \b The \i\c{__UTC_DATE_NUM__} and \i\c{__UTC_TIME_NUM__} macros give the
  2984. assembly date and time in universal time (UTC) in numeric form; in the
  2985. format \c{YYYYMMDD} and \c{HHMMSS} respectively. If the
  2986. host platform doesn't provide UTC time, these macros are
  2987. undefined.
  2988. \b The \c{__POSIX_TIME__} macro is defined as a number containing the
  2989. number of seconds since the POSIX epoch, 1 January 1970 00:00:00 UTC;
  2990. excluding any leap seconds. This is computed using UTC time if
  2991. available on the host platform, otherwise it is computed using the
  2992. local time as if it were UTC.
  2993. All instances of time and date macros in the same assembly session
  2994. produce consistent output. For example, in an assembly session
  2995. started at 42 seconds after midnight on January 1, 2010 in Moscow
  2996. (timezone UTC+3) these macros would have the following values,
  2997. assuming, of course, a properly configured environment with a correct
  2998. clock:
  2999. \c __DATE__ "2010-01-01"
  3000. \c __TIME__ "00:00:42"
  3001. \c __DATE_NUM__ 20100101
  3002. \c __TIME_NUM__ 000042
  3003. \c __UTC_DATE__ "2009-12-31"
  3004. \c __UTC_TIME__ "21:00:42"
  3005. \c __UTC_DATE_NUM__ 20091231
  3006. \c __UTC_TIME_NUM__ 210042
  3007. \c __POSIX_TIME__ 1262293242
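One plausible use is embedding a build stamp in the output. The sketch
below prefers the UTC macros and falls back to local time when they are
undefined; the labels \c{build_stamp} and \c{build_epoch} are
hypothetical.

\c %ifdef __UTC_DATE__
\c build_stamp:    db      __UTC_DATE__, ' ', __UTC_TIME__, ' UTC', 0
\c %else
\c build_stamp:    db      __DATE__, ' ', __TIME__, 0
\c %endif
\c build_epoch:    dq      __POSIX_TIME__  ; seconds since the POSIX epoch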
  3008. \S{use_def} \I\c{__USE_*__}\c{__USE_}\e{package}\c{__}: Package
  3009. Include Test
  3010. When a standard macro package (see \k{macropkg}) is included with the
  3011. \c{%use} directive (see \k{use}), a single-line macro of the form
  3012. \c{__USE_}\e{package}\c{__} is automatically defined. This allows
  3013. testing if a particular package is invoked or not.
  3014. For example, if the \c{altreg} package is included (see
  3015. \k{pkg_altreg}), then the macro \c{__USE_ALTREG__} is defined.
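A shared include file could use this test to pull in a package only if
the including file has not already done so. This is a sketch of one
possible convention, not a requirement:

\c %ifndef __USE_ALTREG__
\c   %use altreg
\c %endif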
  3016. \S{pass_macro} \i\c{__PASS__}: Assembly Pass
  3017. The macro \c{__PASS__} is defined to be \c{1} on preparatory passes,
  3018. and \c{2} on the final pass. In preprocess-only mode, it is set to
  3019. \c{3}, and when running only to generate dependencies (due to the
  3020. \c{-M} or \c{-MG} option, see \k{opt-M}) it is set to \c{0}.
  3021. \e{Avoid using this macro if at all possible. It is tremendously easy
  3022. to generate very strange errors by misusing it, and the semantics may
  3023. change in future versions of NASM.}
  3024. \S{struc} \i\c{STRUC} and \i\c{ENDSTRUC}: \i{Declaring Structure} Data Types
  3025. The core of NASM contains no intrinsic means of defining data
  3026. structures; instead, the preprocessor is sufficiently powerful that
  3027. data structures can be implemented as a set of macros. The macros
  3028. \c{STRUC} and \c{ENDSTRUC} are used to define a structure data type.
  3029. \c{STRUC} takes one or two parameters. The first parameter is the name
  3030. of the data type. The second, optional parameter is the base offset of
  3031. the structure. The name of the data type is defined as a symbol with
  3032. the value of the base offset, and the name of the data type with the
  3033. suffix \c{_size} appended to it is defined as an \c{EQU} giving the
  3034. size of the structure. Once \c{STRUC} has been issued, you are
  3035. defining the structure, and should define fields using the \c{RESB}
  3036. family of pseudo-instructions, and then invoke \c{ENDSTRUC} to finish
  3037. the definition.
  3038. For example, to define a structure called \c{mytype} containing a
  3039. longword, a word, a byte and a string of bytes, you might code
  3040. \c struc mytype
  3041. \c
  3042. \c mt_long: resd 1
  3043. \c mt_word: resw 1
  3044. \c mt_byte: resb 1
  3045. \c mt_str: resb 32
  3046. \c
  3047. \c endstruc
  3048. The above code defines six symbols: \c{mt_long} as 0 (the offset
  3049. from the beginning of a \c{mytype} structure to the longword field),
  3050. \c{mt_word} as 4, \c{mt_byte} as 6, \c{mt_str} as 7, \c{mytype_size}
  3051. as 39, and \c{mytype} itself as zero.
  3052. The reason why the structure type name is defined at zero by default
  3053. is a side effect of allowing structures to work with the local label
  3054. mechanism: if your structure members tend to have the same names in
  3055. more than one structure, you can define the above structure like this:
  3056. \c struc mytype
  3057. \c
  3058. \c .long: resd 1
  3059. \c .word: resw 1
  3060. \c .byte: resb 1
  3061. \c .str: resb 32
  3062. \c
  3063. \c endstruc
  3064. This defines the offsets to the structure fields as \c{mytype.long},
  3065. \c{mytype.word}, \c{mytype.byte} and \c{mytype.str}.
  3066. NASM, since it has no \e{intrinsic} structure support, does not
  3067. support any form of period notation to refer to the elements of a
  3068. structure once you have one (except the above local-label notation),
  3069. so code such as \c{mov ax,[mystruc.mt_word]} is not valid.
  3070. \c{mt_word} is a constant just like any other constant, so the
  3071. correct syntax is \c{mov ax,[mystruc+mt_word]} or \c{mov
  3072. ax,[mystruc+mytype.word]}.
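The \c{mytype_size} symbol combines with the field offsets in the same
way. As a sketch, assuming the first \c{mytype} definition above and a
hypothetical label \c{table}, an array of structures can be reserved
and indexed like this:

\c         section .bss
\c
\c table:  resb    mytype_size * 8         ; room for eight mytype records
\c
\c         section .text
\c
\c         mov     ax,[table + 2*mytype_size + mt_word]    ; word field of record 2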
  3073. Sometimes you only have the address of the structure displaced by an
  3074. offset. For example, consider this standard stack frame setup:
  3075. \c push ebp
  3076. \c mov ebp, esp
  3077. \c sub esp, 40
  3078. In this case, you could access an element by subtracting the offset:
  3079. \c mov [ebp - 40 + mytype.word], ax
  3080. However, if you do not want to repeat this offset, you can use -40 as
  3081. a base offset:
  3082. \c struc mytype, -40
  3083. And access an element this way:
  3084. \c mov [ebp + mytype.word], ax
  3085. \S{istruc} \i\c{ISTRUC}, \i\c{AT} and \i\c{IEND}: Declaring
  3086. \i{Instances of Structures}
  3087. Having defined a structure type, the next thing you typically want
  3088. to do is to declare instances of that structure in your data
  3089. segment. NASM provides an easy way to do this in the \c{ISTRUC}
  3090. mechanism. To declare a structure of type \c{mytype} in a program,
  3091. you code something like this:
  3092. \c mystruc:
  3093. \c istruc mytype
  3094. \c
  3095. \c at mt_long, dd 123456
  3096. \c at mt_word, dw 1024
  3097. \c at mt_byte, db 'x'
  3098. \c at mt_str, db 'hello, world', 13, 10, 0
  3099. \c
  3100. \c iend
  3101. The function of the \c{AT} macro is to make use of the \c{TIMES}
  3102. prefix to advance the assembly position to the correct point for the
  3103. specified structure field, and then to declare the specified data.
  3104. Therefore the structure fields must be declared in the same order as
  3105. they were specified in the structure definition.
  3106. If the data to go in a structure field requires more than one source
  3107. line to specify, the remaining source lines can easily come after
  3108. the \c{AT} line. For example:
  3109. \c at mt_str, db 123,134,145,156,167,178,189
  3110. \c db 190,100,0
  3111. Depending on personal taste, you can also omit the code part of the
  3112. \c{AT} line completely, and start the structure field on the next
  3113. line:
  3114. \c at mt_str
  3115. \c db 'hello, world'
  3116. \c db 13,10,0
  3117. \S{align} \i\c{ALIGN} and \i\c{ALIGNB}: Data Alignment
  3118. The \c{ALIGN} and \c{ALIGNB} macros provide a convenient way to
  3119. align code or data on a word, longword, paragraph or other boundary.
  3120. (Some assemblers call this directive \i\c{EVEN}.) The syntax of the
  3121. \c{ALIGN} and \c{ALIGNB} macros is
  3122. \c align 4 ; align on 4-byte boundary
  3123. \c align 16 ; align on 16-byte boundary
  3124. \c align 8,db 0 ; pad with 0s rather than NOPs
  3125. \c align 4,resb 1 ; align to 4 in the BSS
  3126. \c alignb 4 ; equivalent to previous line
  3127. Both macros require their first argument to be a power of two; they
  3128. both compute the number of additional bytes required to bring the
  3129. length of the current section up to a multiple of that power of two,
  3130. and then apply the \c{TIMES} prefix to their second argument to
  3131. perform the alignment.
  3132. If the second argument is not specified, the default for \c{ALIGN}
  3133. is \c{NOP}, and the default for \c{ALIGNB} is \c{RESB 1}. So if the
  3134. second argument is specified, the two macros are equivalent.
  3135. Normally, you can just use \c{ALIGN} in code and data sections and
  3136. \c{ALIGNB} in BSS sections, and never need the second argument
  3137. except for special purposes.
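The computation described above can be sketched as a user macro. The
name \c{padto} is hypothetical, and this is an illustration of the
technique rather than the definition NASM actually ships: \c{$$-$} is
the negated offset within the section, and masking it with the
alignment minus one yields the number of padding units.

\c %macro padto 1-2+ nop
\c         times (($$-$) & ((%1)-1)) %2
\c %endmacro
\c
\c         padto   16              ; NOPs up to the next 16-byte boundary
\c         padto   4,db 0          ; zero bytes up to the next 4-byte boundary

Like the real macros, this sketch only gives the right answer when the
first argument is a power of two and the second generates exactly one
byte.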
  3138. \c{ALIGN} and \c{ALIGNB}, being simple macros, perform no error
  3139. checking: they cannot warn you if their first argument fails to be a
  3140. power of two, or if their second argument generates more than one
  3141. byte of code. In each of these cases they will silently do the wrong
  3142. thing.
  3143. \c{ALIGNB} (or \c{ALIGN} with a second argument of \c{RESB 1}) can
  3144. be used within structure definitions:
  3145. \c struc mytype2
  3146. \c
  3147. \c mt_byte:
  3148. \c resb 1
  3149. \c alignb 2
  3150. \c mt_word:
  3151. \c resw 1
  3152. \c alignb 4
  3153. \c mt_long:
  3154. \c resd 1
  3155. \c mt_str:
  3156. \c resb 32
  3157. \c
  3158. \c endstruc
  3159. This will ensure that the structure members are sensibly aligned
  3160. relative to the base of the structure.
  3161. A final caveat: \c{ALIGN} and \c{ALIGNB} work relative to the
  3162. beginning of the \e{section}, not the beginning of the address space
  3163. in the final executable. Aligning to a 16-byte boundary when the
  3164. section you're in is only guaranteed to be aligned to a 4-byte
  3165. boundary, for example, is a waste of effort. Again, NASM does not
  3166. check that the section's alignment characteristics are sensible for
  3167. the use of \c{ALIGN} or \c{ALIGNB}.
  3168. Both \c{ALIGN} and \c{ALIGNB} implicitly call the \c{SECTALIGN} macro.
  3169. See \k{sectalign} for details.
  3170. See also the \c{smartalign} standard macro package, \k{pkg_smartalign}.
  3171. \S{sectalign} \i\c{SECTALIGN}: Section Alignment
  3172. The \c{SECTALIGN} macro provides a way to modify the alignment attribute
  3173. of an output file section. Unlike the \c{align=} attribute (which is allowed
  3174. at section definition only), the \c{SECTALIGN} macro may be used at any time.
  3175. For example the directive
  3176. \c SECTALIGN 16
  3177. sets the section alignment requirement to 16 bytes. Once increased, it
  3178. cannot be decreased; the alignment may only grow.
  3179. Note that \c{ALIGN} (see \k{align}) calls the \c{SECTALIGN} macro implicitly,
  3180. so the active section alignment requirement may be updated. This is the default
  3181. behaviour; if for some reason you do not want \c{ALIGN} to call \c{SECTALIGN}
  3182. at all, use the directive
  3183. \c SECTALIGN OFF
  3184. It is still possible to turn it on again with
  3185. \c SECTALIGN ON
  3186. \C{macropkg} \i{Standard Macro Packages}
  3187. The \i\c{%use} directive (see \k{use}) includes one of the standard
  3188. macro packages included with the NASM distribution and compiled into
  3189. the NASM binary. It operates like the \c{%include} directive (see
  3190. \k{include}), but the included contents are provided by NASM itself.
  3191. The names of standard macro packages are case insensitive, and can be
  3192. quoted or not.
  3193. \H{pkg_altreg} \i\c{altreg}: \i{Alternate Register Names}
  3194. The \c{altreg} standard macro package provides alternate register
  3195. names. It provides numeric register names for all registers (not just
  3196. \c{R8}-\c{R15}), the Intel-defined aliases \c{R8L}-\c{R15L} for the
  3197. low bytes of registers (as opposed to the NASM/AMD standard names
  3198. \c{R8B}-\c{R15B}), and the names \c{R0H}-\c{R3H} (by analogy with
  3199. \c{R0L}-\c{R3L}) for \c{AH}, \c{CH}, \c{DH}, and \c{BH}.
  3200. Example use:
  3201. \c %use altreg
  3202. \c
  3203. \c proc:
  3204. \c mov r0l,r3h ; mov al,bh
  3205. \c ret
  3206. See also \k{reg64}.
  3207. \H{pkg_smartalign} \i\c{smartalign}\I{align, smart}: Smart \c{ALIGN} Macro
  3208. The \c{smartalign} standard macro package provides for an \i\c{ALIGN}
  3209. macro which is more powerful than the default (and
  3210. backwards-compatible) one (see \k{align}). When the \c{smartalign}
  3211. package is enabled and \c{ALIGN} is used without a second argument,
  3212. NASM will generate a sequence of instructions more efficient than a
  3213. series of \c{NOP}. Furthermore, if the padding exceeds a specific
  3214. threshold, then NASM will generate a jump over the entire padding
  3215. sequence.
  3216. The specific instructions generated can be controlled with the
  3217. new \i\c{ALIGNMODE} macro. This macro takes two parameters: one mode,
  3218. and an optional jump threshold override. If (for any reason) you need
  3219. to turn off the jump completely just set jump threshold value to -1
  3220. (or set it to \c{nojmp}). The following modes are possible:
  3221. \b \c{generic}: Works on all x86 CPUs and should have reasonable
  3222. performance. The default jump threshold is 8. This is the
  3223. default.
  3224. \b \c{nop}: Pad out with \c{NOP} instructions. The only difference
  3225. compared to the standard \c{ALIGN} macro is that NASM can still jump
  3226. over a large padding area. The default jump threshold is 16.
  3227. \b \c{k7}: Optimize for the AMD K7 (Athlon/Athlon XP). These
  3228. instructions should still work on all x86 CPUs. The default jump
  3229. threshold is 16.
  3230. \b \c{k8}: Optimize for the AMD K8 (Opteron/Athlon 64). These
  3231. instructions should still work on all x86 CPUs. The default jump
  3232. threshold is 16.
  3233. \b \c{p6}: Optimize for Intel CPUs. This uses the long \c{NOP}
  3234. instructions first introduced in Pentium Pro. This is incompatible
  3235. with all CPUs of family 5 or lower, as well as some VIA CPUs and
  3236. several virtualization solutions. The default jump threshold is 16.
  3237. The macro \i\c{__ALIGNMODE__} is defined to contain the current
  3238. alignment mode. A number of other macros beginning with \c{__ALIGN_}
  3239. are used internally by this macro package.
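A minimal usage sketch follows. The mode and threshold shown are
arbitrary choices for illustration, and \c{loop_top} is a hypothetical
label:

\c %use smartalign
\c
\c         alignmode generic,16    ; generic padding, jump threshold of 16
\c
\c         align   16              ; padded with the selected sequence
\c loop_top:
\c         dec     ecx
\c         jnz     loop_top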
  3240. \H{pkg_fp} \i\c{fp}: Floating-point macros
  3241. This package contains the following floating-point convenience macros:
  3242. \c %define Inf __Infinity__
  3243. \c %define NaN __QNaN__
  3244. \c %define QNaN __QNaN__
  3245. \c %define SNaN __SNaN__
  3246. \c
  3247. \c %define float8(x) __float8__(x)
  3248. \c %define float16(x) __float16__(x)
  3249. \c %define float32(x) __float32__(x)
  3250. \c %define float64(x) __float64__(x)
  3251. \c %define float80m(x) __float80m__(x)
  3252. \c %define float80e(x) __float80e__(x)
  3253. \c %define float128l(x) __float128l__(x)
  3254. \c %define float128h(x) __float128h__(x)
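As a short sketch of how these might be used in data declarations (the
labels are hypothetical):

\c %use fp
\c
\c one_half:       dd      float32(0.5)                ; single precision
\c pi_value:       dq      float64(3.141592653589793)  ; double precision
\c pos_inf:        dd      Inf
\c quiet_nan:      dq      NaN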
  3255. \H{pkg_ifunc} \i\c{ifunc}: \i{Integer functions}
  3256. This package contains a set of macros which implement integer
  3257. functions. These are actually implemented as special operators, but
  3258. are most conveniently accessed via this macro package.
  3259. The macros provided are:
  3260. \S{ilog2} \i{Integer logarithms}
  3261. These functions calculate the integer logarithm base 2 of their
  3262. argument, considered as an unsigned integer. The only differences
  3263. between the functions is their respective behavior if the argument
  3264. provided is not a power of two.
  3265. The function \i\c{ilog2e()} (alias \i\c{ilog2()}) generates an error if
  3266. the argument is not a power of two.
  3267. The function \i\c{ilog2f()} rounds the argument down to the nearest
  3268. power of two; if the argument is zero it returns zero.
  3269. The function \i\c{ilog2c()} rounds the argument up to the nearest
  3270. power of two.
  3271. The functions \i\c{ilog2fw()} (alias \i\c{ilog2w()}) and
  3272. \i\c{ilog2cw()} generate a warning if the argument is not a power of
  3273. two, but otherwise behave like \c{ilog2f()} and \c{ilog2c()},
  3274. respectively.
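The difference between the rounding variants is easiest to see with a
value that is not a power of two. In this sketch the macro names on the
left are hypothetical:

\c %use ifunc
\c
\c %assign PAGE_SHIFT  ilog2(4096)         ; exact power of two: 12
\c %assign FLOOR_LOG   ilog2f(1000)        ; rounds down to 512: 9
\c %assign CEIL_LOG    ilog2c(1000)        ; rounds up to 1024: 10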
  3275. \C{directive} \i{Assembler Directives}
  3276. NASM, though it attempts to avoid the bureaucracy of assemblers like
  3277. MASM and TASM, is nevertheless forced to support a \e{few}
  3278. directives. These are described in this chapter.
  3279. NASM's directives come in two types: \I{user-level
  3280. directives}\e{user-level} directives and \I{primitive
  3281. directives}\e{primitive} directives. Typically, each directive has a
  3282. user-level form and a primitive form. In almost all cases, we
  3283. recommend that users use the user-level forms of the directives,
  3284. which are implemented as macros which call the primitive forms.
  3285. Primitive directives are enclosed in square brackets; user-level
  3286. directives are not.
  3287. In addition to the universal directives described in this chapter,
  3288. each object file format can optionally supply extra directives in
  3289. order to control particular features of that file format. These
  3290. \I{format-specific directives}\e{format-specific} directives are
  3291. documented along with the formats that implement them, in \k{outfmt}.
  3292. \H{bits} \i\c{BITS}: Specifying Target \i{Processor Mode}
  3293. The \c{BITS} directive specifies whether NASM should generate code
  3294. \I{16-bit mode, versus 32-bit mode}designed to run on a processor
  3295. operating in 16-bit mode, 32-bit mode or 64-bit mode. The syntax is
  3296. \c{BITS XX}, where XX is 16, 32 or 64.
  3297. In most cases, you should not need to use \c{BITS} explicitly. The
  3298. \c{aout}, \c{coff}, \c{elf}, \c{macho}, \c{win32} and \c{win64}
  3299. object formats, which are designed for use in 32-bit or 64-bit
  3300. operating systems, all cause NASM to select 32-bit or 64-bit mode,
  3301. respectively, by default. The \c{obj} object format allows you
  3302. to specify each segment you define as either \c{USE16} or \c{USE32},
  3303. and NASM will set its operating mode accordingly, so the use of the
  3304. \c{BITS} directive is once again unnecessary.
  3305. The most likely reason for using the \c{BITS} directive is to write
  3306. 32-bit or 64-bit code in a flat binary file; this is because the \c{bin}
  3307. output format defaults to 16-bit mode in anticipation of it being
  3308. used most frequently to write DOS \c{.COM} programs, DOS \c{.SYS}
  3309. device drivers and boot loader software.
  3310. The \c{BITS} directive can also be used to generate code for a
  3311. different mode than the standard one for the output format.
  3312. You do \e{not} need to specify \c{BITS 32} merely in order to use
  3313. 32-bit instructions in a 16-bit DOS program; if you do, the
  3314. assembler will generate incorrect code because it will be writing
  3315. code targeted at a 32-bit platform, to be run on a 16-bit one.
  3316. When NASM is in \c{BITS 16} mode, instructions which use 32-bit
  3317. data are prefixed with an 0x66 byte, and those referring to 32-bit
  3318. addresses have an 0x67 prefix. In \c{BITS 32} mode, the reverse is
  3319. true: 32-bit instructions require no prefixes, whereas instructions
  3320. using 16-bit data need an 0x66 and those working on 16-bit addresses
  3321. need an 0x67.
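The effect of these prefixes can be seen by assembling the same
instructions in both modes. The sketch below shows the bytes one would
expect for a few simple cases:

\c         bits 16
\c         mov     ax,1            ; B8 01 00
\c         mov     eax,1           ; 66 B8 01 00 00 00
\c         mov     ax,[ebx]        ; 67 8B 03
\c
\c         bits 32
\c         mov     eax,1           ; B8 01 00 00 00
\c         mov     ax,1            ; 66 B8 01 00
\c         mov     eax,[bx]        ; 67 8B 07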
  3322. When NASM is in \c{BITS 64} mode, most instructions operate the same
  3323. as they do for \c{BITS 32} mode. However, there are 8 more general and
  3324. SSE registers, and 16-bit addressing is no longer supported.
  3325. The default address size is 64 bits; 32-bit addressing can be selected
  3326. with the 0x67 prefix. The default operand size is still 32 bits,
  3327. however, and the 0x66 prefix selects 16-bit operand size. The \c{REX}
  3328. prefix is used both to select 64-bit operand size, and to access the
  3329. new registers. NASM automatically inserts REX prefixes when
  3330. necessary.
  3331. When the \c{REX} prefix is used, the processor does not know how to
  3332. address the AH, BH, CH or DH (high 8-bit legacy) registers. Instead,
  3333. it is possible to access the low 8 bits of the SP, BP, SI and DI
  3334. registers as SPL, BPL, SIL and DIL, respectively; but only when the
  3335. REX prefix is used.
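As an illustrative sketch of the 64-bit rules described above, these
are the encodings one would expect for a few register moves:

\c         bits 64
\c         mov     eax,ebx         ; 89 D8      default 32-bit operand size
\c         mov     ax,bx           ; 66 89 D8   0x66 selects 16-bit operands
\c         mov     rax,rbx         ; 48 89 D8   REX.W selects 64-bit operands
\c         mov     r8d,eax         ; 41 89 C0   REX.B reaches a new register
\c         mov     al,sil          ; 40 88 F0   SIL requires a REX prefix
\c         mov     eax,[ebx]       ; 67 8B 03   0x67 selects 32-bit addressing

An instruction such as \c{mov ah,sil} cannot be encoded, since \c{SIL}
needs a REX prefix and \c{AH} is unavailable once one is present.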
  3336. The \c{BITS} directive has an exactly equivalent primitive form,
  3337. \c{[BITS 16]}, \c{[BITS 32]} and \c{[BITS 64]}. The user-level form is
  3338. a macro which has no function other than to call the primitive form.
  3339. Note that the space is necessary, e.g. \c{BITS32} will \e{not} work!
  3340. \S{USE16 & USE32} \i\c{USE16} & \i\c{USE32}: Aliases for BITS
  3341. The `\c{USE16}' and `\c{USE32}' directives can be used in place of
  3342. `\c{BITS 16}' and `\c{BITS 32}', for compatibility with other assemblers.
  3343. \H{default} \i\c{DEFAULT}: Change the assembler defaults
  3344. The \c{DEFAULT} directive changes the assembler defaults. Normally,
  3345. NASM defaults to a mode where the programmer is expected to explicitly
  3346. specify most features directly. However, this is occasionally
  3347. obnoxious, as the explicit form is pretty much the only one that one wishes
  3348. to use.
  3349. Currently, \c{DEFAULT} can set \c{REL} & \c{ABS} and \c{BND} & \c{NOBND}.
  3350. \S{REL & ABS} \i\c{REL} & \i\c{ABS}: RIP-relative addressing
  3351. This sets whether registerless instructions in 64-bit mode are \c{RIP}-relative
  3352. or not. By default, they are absolute unless overridden with the \i\c{REL}
  3353. specifier (see \k{effaddr}). However, if \c{DEFAULT REL} is
  3354. specified, \c{REL} is default, unless overridden with the \c{ABS}
  3355. specifier, \e{except when used with an FS or GS segment override}.
  3356. The special handling of \c{FS} and \c{GS} overrides is due to the
  3357. fact that these registers are generally used as thread pointers or
  3358. other special functions in 64-bit mode, and generating
  3359. \c{RIP}-relative addresses would be extremely confusing.
  3360. \c{DEFAULT REL} is disabled with \c{DEFAULT ABS}.
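A sketch of how the default and the per-operand specifiers interact,
using a hypothetical label \c{counter}:

\c         bits 64
\c         default rel
\c
\c         mov     eax,[counter]           ; RIP-relative by default
\c         mov     eax,[abs counter]       ; absolute on request
\c
\c         default abs
\c
\c         mov     eax,[counter]           ; absolute by default again
\c         mov     eax,[rel counter]       ; RIP-relative on request
\c
\c counter:        dd      0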
  3361. \S{BND & NOBND} \i\c{BND} & \i\c{NOBND}: \c{BND} prefix
  3362. If \c{DEFAULT BND} is set, all instructions following this directive that
  3363. accept a \c{BND} prefix are prefixed with \c{BND}. To override this, the \c{NOBND} prefix can
  3364. be used.
  3365. \c DEFAULT BND
  3366. \c call foo ; BND will be prefixed
  3367. \c nobnd call foo ; BND will NOT be prefixed
  3368. \c{DEFAULT NOBND} disables \c{DEFAULT BND}; the \c{BND} prefix will then be
  3369. added only when explicitly specified in code.
  3370. \c{DEFAULT BND} is expected to be the normal configuration for writing
  3371. MPX-enabled code.
  3372. \H{section} \i\c{SECTION} or \i\c{SEGMENT}: Changing and \i{Defining
  3373. Sections}
  3374. \I{changing sections}\I{switching between sections}The \c{SECTION}
  3375. directive (\c{SEGMENT} is an exactly equivalent synonym) changes
  3376. which section of the output file the code you write will be
  3377. assembled into. In some object file formats, the number and names of
  3378. sections are fixed; in others, the user may make up as many as they
  3379. wish. Hence \c{SECTION} may sometimes give an error message, or may
  3380. define a new section, if you try to switch to a section that does
  3381. not (yet) exist.
  3382. The Unix object formats, and the \c{bin} object format (but see
  3383. \k{multisec}), all support
  3384. the \i{standardized section names} \c{.text}, \c{.data} and \c{.bss}
  3385. for the code, data and uninitialized-data sections. The \c{obj}
  3386. format, by contrast, does not recognize these section names as being
  3387. special, and indeed will strip off the leading period of any section
  3388. name that has one.
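A minimal sketch of a source file using the three standardized names,
for a format that supports them, might look like this (all labels are
hypothetical):

\c         section .data
\c
\c greeting:       db      'hello', 10
\c greeting_len    equ     $ - greeting
\c
\c         section .bss
\c
\c scratch:        resb    64              ; uninitialized buffer
\c
\c         section .text
\c
\c start:
\c         mov     eax,greeting_len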
  3389. \S{sectmac} The \i\c{__SECT__} Macro
  3390. The \c{SECTION} directive is unusual in that its user-level form
  3391. functions differently from its primitive form. The primitive form,
  3392. \c{[SECTION xyz]}, simply switches the current target section to the
  3393. one given. The user-level form, \c{SECTION xyz}, however, first
  3394. defines the single-line macro \c{__SECT__} to be the primitive
  3395. \c{[SECTION]} directive which it is about to issue, and then issues
  3396. it. So the user-level directive
  3397. \c SECTION .text
  3398. expands to the two lines
  3399. \c %define __SECT__ [SECTION .text]
  3400. \c [SECTION .text]
  3401. Users may find it useful to make use of this in their own macros.
  3402. For example, the \c{writefile} macro defined in \k{mlmacgre} can be
  3403. usefully rewritten in the following more sophisticated form:
  3404. \c %macro writefile 2+
  3405. \c
  3406. \c [section .data]
  3407. \c
  3408. \c %%str: db %2
  3409. \c %%endstr:
  3410. \c
  3411. \c __SECT__
  3412. \c
  3413. \c mov dx,%%str
  3414. \c mov cx,%%endstr-%%str
  3415. \c mov bx,%1
  3416. \c mov ah,0x40
  3417. \c int 0x21
  3418. \c
  3419. \c %endmacro
  3420. This form of the macro, once passed a string to output, first
  3421. switches temporarily to the data section of the file, using the
  3422. primitive form of the \c{SECTION} directive so as not to modify
  3423. \c{__SECT__}. It then declares its string in the data section, and
  3424. then invokes \c{__SECT__} to switch back to \e{whichever} section
  3425. the user was previously working in. It thus avoids the need, in the
  3426. previous version of the macro, to include a \c{JMP} instruction to
  3427. jump over the data, and also does not fail if, in a complicated
  3428. \c{OBJ} format module, the user could potentially be assembling the
  3429. code in any of several separate code sections.
  3430. \H{absolute} \i\c{ABSOLUTE}: Defining Absolute Labels
  3431. The \c{ABSOLUTE} directive can be thought of as an alternative form
  3432. of \c{SECTION}: it causes the subsequent code to be directed at no
  3433. physical section, but at the hypothetical section starting at the
  3434. given absolute address. The only instructions you can use in this
  3435. mode are the \c{RESB} family.
  3436. \c{ABSOLUTE} is used as follows:
  3437. \c absolute 0x1A
  3438. \c
  3439. \c kbuf_chr resw 1
  3440. \c kbuf_free resw 1
  3441. \c kbuf resw 16
  3442. This example describes a section of the PC BIOS data area, at
  3443. segment address 0x40: the above code defines \c{kbuf_chr} to be
  3444. 0x1A, \c{kbuf_free} to be 0x1C, and \c{kbuf} to be 0x1E.
  3445. The user-level form of \c{ABSOLUTE}, like that of \c{SECTION},
  3446. redefines the \i\c{__SECT__} macro when it is invoked.
  3447. \i\c{STRUC} and \i\c{ENDSTRUC} are defined as macros which use
  3448. \c{ABSOLUTE} (and also \c{__SECT__}).
  3449. \c{ABSOLUTE} doesn't have to take an absolute constant as an
  3450. argument: it can take an expression (actually, a \i{critical
  3451. expression}: see \k{crit}) and it can be a value in a segment. For
  3452. example, a TSR can re-use its setup code as run-time BSS like this:
  3453. \c org 100h ; it's a .COM program
  3454. \c
  3455. \c jmp setup ; setup code comes last
  3456. \c
  3457. \c ; the resident part of the TSR goes here
  3458. \c setup:
  3459. \c ; now write the code that installs the TSR here
  3460. \c
  3461. \c absolute setup
  3462. \c
  3463. \c runtimevar1 resw 1
  3464. \c runtimevar2 resd 20
  3465. \c
  3466. \c tsr_end:
  3467. This defines some variables `on top of' the setup code, so that
  3468. after the setup has finished running, the space it took up can be
  3469. re-used as data storage for the running TSR. The symbol `tsr_end'
  3470. can be used to calculate the total size of the part of the TSR that
  3471. needs to be made resident.
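As a hedged sketch of that calculation, assuming a DOS \c{.COM}
program (so that offsets are counted from the start of the PSP) and
the classic \c{INT 27h} terminate-and-stay-resident service, the last
instructions of the setup code might read:

\c         mov     dx,tsr_end      ; offset of the first byte NOT kept resident
\c         int     0x27            ; DOS: terminate and stay resident

These must be the final instructions the setup code executes, because
the memory it occupies is re-used once the TSR is running.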
  3472. \H{extern} \i\c{EXTERN}: \i{Importing Symbols} from Other Modules
  3473. \c{EXTERN} is similar to the MASM directive \c{EXTRN} and the C
  3474. keyword \c{extern}: it is used to declare a symbol which is not
  3475. defined anywhere in the module being assembled, but is assumed to be
  3476. defined in some other module and needs to be referred to by this
  3477. one. Not every object-file format can support external variables:
  3478. the \c{bin} format cannot.
  3479. The \c{EXTERN} directive takes as many arguments as you like. Each
  3480. argument is the name of a symbol:
  3481. \c extern _printf
  3482. \c extern _sscanf,_fscanf
  3483. Some object-file formats provide extra features to the \c{EXTERN}
  3484. directive. In all cases, the extra features are used by suffixing a
  3485. colon to the symbol name followed by object-format specific text.
  3486. For example, the \c{obj} format allows you to declare that the
  3487. default segment base of an external should be the group \c{dgroup}
  3488. by means of the directive
  3489. \c extern _variable:wrt dgroup
  3490. The primitive form of \c{EXTERN} differs from the user-level form
  3491. only in that it can take only one argument at a time: the support
  3492. for multiple arguments is implemented at the preprocessor level.
  3493. You can declare the same variable as \c{EXTERN} more than once: NASM
  3494. will quietly ignore the second and later redeclarations.
  3495. If a variable is declared both \c{GLOBAL} and \c{EXTERN}, or if it is
  3496. declared as \c{EXTERN} and then defined, it will be treated as
  3497. \c{GLOBAL}. If a variable is declared both as \c{COMMON} and
  3498. \c{EXTERN}, it will be treated as \c{COMMON}.
  3499. \H{global} \i\c{GLOBAL}: \i{Exporting Symbols} to Other Modules
  3500. \c{GLOBAL} is the other end of \c{EXTERN}: if one module declares a
  3501. symbol as \c{EXTERN} and refers to it, then in order to prevent
  3502. linker errors, some other module must actually \e{define} the
  3503. symbol and declare it as \c{GLOBAL}. Some assemblers use the name
  3504. \i\c{PUBLIC} for this purpose.
  3505. \c{GLOBAL} uses the same syntax as \c{EXTERN}, except that it must
  3506. refer to symbols which \e{are} defined in the same module as the
  3507. \c{GLOBAL} directive. For example:
  3508. \c global _main
  3509. \c _main:
  3510. \c ; some code
  3511. \c{GLOBAL}, like \c{EXTERN}, allows object formats to define private
  3512. extensions by means of a colon. The \c{elf} object format, for
  3513. example, lets you specify whether global data items are functions or
  3514. data:
  3515. \c global hashlookup:function, hashtable:data
  3516. Like \c{EXTERN}, the primitive form of \c{GLOBAL} differs from the
  3517. user-level form only in that it can take only one argument at a
  3518. time.
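The pairing of \c{GLOBAL} and \c{EXTERN} can be sketched with two
small modules. The file names and the symbols \c{add_one} and
\c{caller} are hypothetical, and a 32-bit object format is assumed:

\c ; ---- module1.asm: defines and exports the symbol ----
\c         global  add_one
\c
\c         section .text
\c add_one:
\c         inc     eax
\c         ret
\c
\c ; ---- module2.asm: imports and uses the symbol ----
\c         extern  add_one
\c
\c         section .text
\c caller:
\c         mov     eax,41
\c         call    add_one         ; address is filled in by the linker
\c         ret

If \c{module1.asm} omitted the \c{GLOBAL} line, the linker would
report \c{add_one} as undefined when resolving \c{module2}.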
  3519. \H{common} \i\c{COMMON}: Defining Common Data Areas
  3520. The \c{COMMON} directive is used to declare \i\e{common variables}.
  3521. A common variable is much like a global variable declared in the
  3522. uninitialized data section, so that
  3523. \c common intvar 4
  3524. is similar in function to
  3525. \c global intvar
  3526. \c section .bss
  3527. \c
  3528. \c intvar resd 1
  3529. The difference is that if more than one module defines the same
  3530. common variable, then at link time those variables will be
  3531. \e{merged}, and references to \c{intvar} in all modules will point
  3532. at the same piece of memory.
  3533. Like \c{GLOBAL} and \c{EXTERN}, \c{COMMON} supports object-format
  3534. specific extensions. For example, the \c{obj} format allows common
  3535. variables to be NEAR or FAR, and the \c{elf} format allows you to
  3536. specify the alignment requirements of a common variable:
  3537. \c common commvar 4:near ; works in OBJ
  3538. \c common intarray 100:4 ; works in ELF: 4 byte aligned
  3539. Once again, like \c{EXTERN} and \c{GLOBAL}, the primitive form of
  3540. \c{COMMON} differs from the user-level form only in that it can take
  3541. only one argument at a time.
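The merging behaviour can be sketched with two modules that both
declare the same common variable. The file names and the symbols
\c{shared_count}, \c{bump} and \c{reset} are hypothetical, and a
32-bit object format is assumed:

\c ; ---- a.asm ----
\c         common  shared_count 4
\c
\c         section .text
\c bump:   inc     dword [shared_count]
\c         ret
\c
\c ; ---- b.asm ----
\c         common  shared_count 4
\c
\c         section .text
\c reset:  mov     dword [shared_count],0
\c         ret

After linking, \c{bump} and \c{reset} operate on the same four bytes
of memory, even though neither module declares the other's symbol as
\c{EXTERN}.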
  3542. \H{static} \i\c{STATIC}: Local Symbols within Modules
  3543. In contrast to \c{EXTERN} and \c{GLOBAL}, \c{STATIC} declares a local
  3544. symbol, but one that should be named according to the global mangling
  3545. rules (named by analogy with the C keyword \c{static} as applied to
  3546. functions or global variables).
  3547. \c static foo
  3548. \c foo:
  3549. \c ; some code
  3550. Unlike \c{GLOBAL}, \c{STATIC} does not allow object formats to accept
  3551. private extensions mentioned in \k{global}.
  3552. \H{mangling} \i\c{(G|L)PREFIX}, \i\c{(G|L)POSTFIX}: Mangling Symbols
  3553. The \c{PREFIX}, \c{GPREFIX}, \c{LPREFIX}, \c{POSTFIX}, \c{GPOSTFIX}, and
  3554. \c{LPOSTFIX} directives prepend or append the given argument to
  3555. all symbols of a certain type. Each directive should be given as a
  3556. preprocessor statement. They are used as follows:
  3557. \b\c{PREFIX}|\c{GPREFIX}: Prepend the argument to all \c{EXTERN},
  3558. \c{COMMON}, \c{STATIC}, and \c{GLOBAL} symbols.
  3559. \b\c{LPREFIX}: Prepend the argument to all other symbols,
  3560. such as local labels and backend-defined symbols.
  3561. \b\c{POSTFIX}|\c{GPOSTFIX}: Append the argument to all \c{EXTERN},
  3562. \c{COMMON}, \c{STATIC}, and \c{GLOBAL} symbols.
  3563. \b\c{LPOSTFIX}: Append the argument to all other symbols,
  3564. such as local labels and backend-defined symbols.
  3565. These directives are macros implemented as a \c{%pragma}:
  3566. \c %pragma macho lprefix L_
  3567. A command-line option can also be used; see \k{opt-pfix}.
  3568. Some toolchains are aware of a particular prefix for their own optimization
  3569. options, such as code elimination. For instance, the Mach-O backend has a
  3570. linker that uses a simplistic naming scheme to chunk up sections into a
  3571. meta section. When the \c{subsections_via_symbols} directive
  3572. (\k{macho-ssvs}) is declared, each symbol is the start of a
  3573. separate block. The meta section is then defined to include sections
  3574. before the one that starts with an 'L'. \c{LPREFIX} is useful here to mark
  3575. all local symbols with the 'L' prefix so that they are excluded from the
  3576. meta section. It makes local symbols compatible with the particular toolchain.
  3577. Note that local symbols declared with \c{STATIC} (\k{static})
  3578. are excluded from the symbol mangling and also not marked as global.
  3579. \H{gen-namespace} \i\c{OUTPUT}, \i\c{DEBUG}: Generic Namespaces
  3580. \c{OUTPUT} and \c{DEBUG} are generic \c{%pragma} namespaces that are
  3581. supposed to redirect to the current output and debug formats.
  3582. For example, when mangling global symbols via the generic namespace:
  3583. \c %pragma output gprefix _
  3584. This is useful when the directive needs to be independent of the
  3585. output format.
  3586. The example is also equivalent to this, when the output format is \c{elf}:
  3587. \c %pragma elf gprefix _
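As a minimal sketch of the effect, the source below keeps its symbol
names free of any prefix and lets the pragma add the underscore. The
symbols \c{entry} and \c{helper} are hypothetical:

\c %pragma output gprefix _
\c
\c         global  entry           ; written to the object file as _entry
\c         extern  helper          ; referenced in the object file as _helper
\c
\c         section .text
\c entry:
\c         call    helper
\c         ret

Because the \c{output} namespace is used, the same source can be
assembled for any output format that needs the prefix; a format that
does not need it can simply be built without the pragma.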
  3588. \H{CPU} \i\c{CPU}: Defining CPU Dependencies
  3589. The \i\c{CPU} directive restricts assembly to those instructions which
  3590. are available on the specified CPU.
  3591. Options are:
  3592. \b\c{CPU 8086} Assemble only 8086 instruction set
  3593. \b\c{CPU 186} Assemble instructions up to the 80186 instruction set
  3594. \b\c{CPU 286} Assemble instructions up to the 286 instruction set
  3595. \b\c{CPU 386} Assemble instructions up to the 386 instruction set
  3596. \b\c{CPU 486} 486 instruction set
  3597. \b\c{CPU 586} Pentium instruction set
  3598. \b\c{CPU PENTIUM} Same as 586
  3599. \b\c{CPU 686} P6 instruction set
  3600. \b\c{CPU PPRO} Same as 686
  3601. \b\c{CPU P2} Same as 686
  3602. \b\c{CPU P3} Pentium III (Katmai) instruction sets
  3603. \b\c{CPU KATMAI} Same as P3
  3604. \b\c{CPU P4} Pentium 4 (Willamette) instruction set
  3605. \b\c{CPU WILLAMETTE} Same as P4
  3606. \b\c{CPU PRESCOTT} Prescott instruction set
  3607. \b\c{CPU X64} x86-64 (x64/AMD64/Intel 64) instruction set
  3608. \b\c{CPU IA64} IA64 CPU (in x86 mode) instruction set
  3609. All options are case insensitive. An instruction is accepted only
  3610. if it is available on the selected CPU or an earlier one. By default, all
  3611. instructions are available.
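As an illustrative sketch, shifting by an immediate count other than 1
was introduced with the 80186, so it is rejected under \c{CPU 8086}.
The offending line is left commented out here so that the fragment
still assembles:

\c         cpu     8086
\c         shl     ax,1            ; accepted: available on the 8086
\c ;       shl     ax,4            ; rejected here: needs a 186 or later
\c
\c         cpu     186
\c         shl     ax,4            ; accepted after raising the CPU level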
  3612. \H{FLOAT} \i\c{FLOAT}: Handling of \I{floating-point, constants}floating-point constants
  3613. By default, floating-point constants are rounded to nearest, and IEEE
  3614. denormals are supported. The following options can be set to alter
  3615. this behaviour:
  3616. \b\c{FLOAT DAZ} Flush denormals to zero
  3617. \b\c{FLOAT NODAZ} Do not flush denormals to zero (default)
  3618. \b\c{FLOAT NEAR} Round to nearest (default)
  3619. \b\c{FLOAT UP} Round up (toward +Infinity)
  3620. \b\c{FLOAT DOWN} Round down (toward -Infinity)
  3621. \b\c{FLOAT ZERO} Round toward zero
  3622. \b\c{FLOAT DEFAULT} Restore default settings
  3623. The standard macros \i\c{__FLOAT_DAZ__}, \i\c{__FLOAT_ROUND__}, and
  3624. \i\c{__FLOAT__} contain the current state, as long as the programmer
  3625. has avoided the use of the bracketed primitive form (\c{[FLOAT]}).
  3626. \c{__FLOAT__} contains the full set of floating-point settings; this
  3627. value can be saved away and invoked later to restore the setting.
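A minimal sketch of switching rounding modes around individual
constants follows. The labels \c{lo} and \c{hi} are hypothetical, and
\c{0.1} is used because it cannot be represented exactly in binary
floating point, so the two modes give different results:

\c         float   down
\c lo:     dd      0.1             ; rounded toward -Infinity
\c
\c         float   up
\c hi:     dd      0.1             ; rounded toward +Infinity
\c
\c         float   default         ; back to round-to-nearest, no DAZ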
  3628. \H{asmdir-warning} \i\c{[WARNING]}: Enable or disable warnings
  3629. The \c{[WARNING]} directive can be used to enable or disable classes
  3630. of warnings in the same way as the \c{-w} option, see \k{opt-w} for
  3631. more details about warning classes.
  3632. \b \c{[warning +}\e{warning-class}\c{]} enables warnings for
  3633. \e{warning-class}.
  3634. \b \c{[warning -}\e{warning-class}\c{]} disables warnings for
  3635. \e{warning-class}.
  3636. \b \c{[warning *}\e{warning-class}\c{]} restores \e{warning-class} to
  3637. the original value, either the default value or as specified on the
  3638. command line.
  3639. The \c{[WARNING]} directive also accepts the \c{all}, \c{error} and
  3640. \c{error=}\e{warning-class} specifiers.
  3641. No "user form" (without the brackets) currently exists.
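As a hedged sketch, a warning class can be switched off around a
single construct and then restored. The class name \c{orphan-labels}
is an assumption here, since class names vary between NASM versions;
see \k{opt-w} for the names your version accepts:

\c [warning -orphan-labels]        ; silence the class for the next line
\c legacy_label                    ; a label alone on a line, without a colon
\c [warning *orphan-labels]        ; back to the default or command-line state

Using \c{*} rather than \c{+} to restore means that a choice made on
the command line is not overridden.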
  3642. \C{outfmt} \i{Output Formats}
  3643. NASM is a portable assembler, designed to be able to compile on any
  3644. ANSI C-supporting platform and produce output to run on a variety of
  3645. Intel x86 operating systems. For this reason, it has a large number
  3646. of available output formats, selected using the \i\c{-f} option on
  3647. the NASM \i{command line}. Each of these formats, along with its
  3648. extensions to the base NASM syntax, is detailed in this chapter.
  3649. As stated in \k{opt-o}, NASM chooses a \i{default name} for your
  3650. output file based on the input file name and the chosen output
  3651. format. This will be generated by removing the \i{extension}
  3652. (\c{.asm}, \c{.s}, or whatever you like to use) from the input file
  3653. name, and substituting an extension defined by the output format.
  3654. The extensions are given with each format below.
  3655. \H{binfmt} \i\c{bin}: \i{Flat-Form Binary}\I{pure binary} Output
  3656. The \c{bin} format does not produce object files: it generates
  3657. nothing in the output file except the code you wrote. Such `pure
  3658. binary' files are used by \i{MS-DOS}: \i\c{.COM} executables and
  3659. \i\c{.SYS} device drivers are pure binary files. Pure binary output
  3660. is also useful for \i{operating system} and \i{boot loader}
  3661. development.
  3662. The \c{bin} format supports \i{multiple section names}. For details of
  3663. how NASM handles sections in the \c{bin} format, see \k{multisec}.
  3664. Using the \c{bin} format puts NASM by default into 16-bit mode (see
  3665. \k{bits}). In order to use \c{bin} to write 32-bit or 64-bit code,
  3666. such as an OS kernel, you need to explicitly issue the \I\c{BITS}\c{BITS 32}
  3667. or \I\c{BITS}\c{BITS 64} directive.
  3668. \c{bin} has no default output file name extension: instead, it
  3669. leaves your file name as it is once the original extension has been
  3670. removed. Thus, the default is for NASM to assemble \c{binprog.asm}
  3671. into a binary file called \c{binprog}.
  3672. \S{org} \i\c{ORG}: Binary File \i{Program Origin}
  3673. The \c{bin} format provides an additional directive to the list
  3674. given in \k{directive}: \c{ORG}. The function of the \c{ORG}
  3675. directive is to specify the origin address which NASM will assume
  3676. the program begins at when it is loaded into memory.
  3677. For example, the following code will generate the longword
  3678. \c{0x00000104}:
  3679. \c org 0x100
  3680. \c dd label
  3681. \c label:
  3682. Unlike the \c{ORG} directive provided by MASM-compatible assemblers,
  3683. which allows you to jump around in the object file and overwrite
  3684. code you have already generated, NASM's \c{ORG} does exactly what
  3685. the directive says: \e{origin}. Its sole function is to specify one
  3686. offset which is added to all internal address references within the
  3687. section; it does not permit any of the trickery that MASM's version
  3688. does. See \k{proborg} for further comments.
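A minimal sketch of a typical use is a DOS \c{.COM} program, which DOS
loads at offset \c{0x100} in its segment. The label \c{msg} is
hypothetical; the DOS services used are \c{INT 21h} functions
\c{09h} (print \c{$}-terminated string) and \c{4Ch} (exit):

\c         org     0x100           ; .COM files are loaded at offset 0x100
\c
\c         mov     dx,msg          ; assembled as 0x100 + offset of msg
\c         mov     ah,9
\c         int     0x21            ; print the string
\c         mov     ax,0x4C00
\c         int     0x21            ; exit with return code 0
\c
\c msg:    db      'Hello, world',13,10,'$'

Without the \c{ORG} line, \c{msg} would be assembled relative to
offset 0 and the program would print from the wrong address.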
  3689. \S{binseg} \c{bin} Extensions to the \c{SECTION}
  3690. Directive\I{SECTION, bin extensions to}
  3691. The \c{bin} output format extends the \c{SECTION} (or \c{SEGMENT})
  3692. directive to allow you to specify the alignment requirements of
  3693. segments. This is done by appending the \i\c{ALIGN} qualifier to the
  3694. end of the section-definition line. For example,
  3695. \c section .data align=16
  3696. switches to the section \c{.data} and also specifies that it must be
  3697. aligned on a 16-byte boundary.
  3698. The parameter to \c{ALIGN} is the required alignment in bytes: the
  3699. section start address is forced to a multiple of it. The alignment value
  3700. given may be any power of two.\I{section alignment, in
  3701. bin}\I{segment alignment, in bin}\I{alignment, in bin sections}
  3702. \S{multisec} \i{Multisection}\I{bin, multisection} Support for the \c{bin} Format
  3703. The \c{bin} format allows the use of multiple sections, of arbitrary names,
  3704. besides the "known" \c{.text}, \c{.data}, and \c{.bss} names.
  3705. \b Sections may be designated \i\c{progbits} or \i\c{nobits}. Default
  3706. is \c{progbits} (except \c{.bss}, which defaults to \c{nobits},
  3707. of course).
  3708. \b Sections can be aligned at a specified boundary following the previous
  3709. section with \c{align=}, or at an arbitrary byte-granular position with
  3710. \i\c{start=}.
  3711. \b Sections can be given a virtual start address, which will be used
  3712. for the calculation of all memory references within that section
  3713. with \i\c{vstart=}.
  3714. \b Sections can be ordered using \i\c{follows=}\c{<section>} or
  3715. \i\c{vfollows=}\c{<section>} as an alternative to specifying an explicit
  3716. start address.
  3717. \b Arguments to \c{org}, \c{start}, \c{vstart}, and \c{align=} are
  3718. critical expressions. See \k{crit}. E.g. \c{align=(1 << ALIGN_SHIFT)}
  3719. - \c{ALIGN_SHIFT} must be defined before it is used here.
  3720. \b Any code which comes before an explicit \c{SECTION} directive
  3721. is directed by default into the \c{.text} section.
  3722. \b If an \c{ORG} statement is not given, \c{ORG 0} is used
  3723. by default.
  3724. \b The \c{.bss} section will be placed after the last \c{progbits}
  3725. section, unless \c{start=}, \c{vstart=}, \c{follows=}, or \c{vfollows=}
  3726. has been specified.
  3727. \b All sections are aligned on dword boundaries, unless a different
  3728. alignment has been specified.
  3729. \b Sections may not overlap.
  3730. \b NASM defines the symbol \c{section.<secname>.start} for each section,
  3731. which may be used in your code.
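Several of the rules above can be combined in one short sketch. The
section name \c{tables} and the labels \c{entry} and \c{table} are
hypothetical, and the addresses are chosen only for illustration:

\c         org     0
\c
\c         section .text                   ; progbits by default
\c entry:  mov     si,table                ; vstart-based address: 0x8000
\c         mov     di,section.tables.start ; where the data sits in the image
\c
\c         section tables  follows=.text align=16 vstart=0x8000
\c table:  dw      1,2,3

Here \c{tables} is stored in the file directly after \c{.text}, padded
to a 16-byte boundary, but references to its labels are calculated as
if it had been loaded at \c{0x8000}. Code of this kind would typically
copy the data from \c{section.tables.start} to its \c{vstart} address
before using it.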
  3732. \S{map}\i{Map Files}
  3733. Map files can be generated in \c{-f bin} format by means of the \c{[map]}
  3734. option. Map types of \c{all} (default), \c{brief}, \c{sections}, \c{segments},
  3735. or \c{symbols} may be specified. Output may be directed to \c{stdout}
  3736. (default), \c{stderr}, or a specified file. E.g.
  3737. \c{[map symbols myfile.map]}. No "user form" exists: the square
  3738. brackets must be used.
  3739. \H{ithfmt} \i\c{ith}: \i{Intel Hex} Output
  3740. The \c{ith} file format produces Intel hex-format files. Like the
  3741. \c{bin} format, this is a flat memory image format with no support for
  3742. relocation or linking. It is usually used with ROM programmers and
  3743. similar utilities.
  3744. All extensions supported by the \c{bin} file format are also supported by
  3745. the \c{ith} file format.
  3746. \c{ith} provides a default output file-name extension of \c{.ith}.
  3747. \H{srecfmt} \i\c{srec}: \i{Motorola S-Records} Output
  3748. The \c{srec} file format produces Motorola S-records files. Like the
  3749. \c{bin} format, this is a flat memory image format with no support for
  3750. relocation or linking. It is usually used with ROM programmers and
  3751. similar utilities.
  3752. All extensions supported by the \c{bin} file format are also supported by
  3753. the \c{srec} file format.
  3754. \c{srec} provides a default output file-name extension of \c{.srec}.
  3755. \H{objfmt} \i\c{obj}: \i{Microsoft OMF}\I{OMF} Object Files
  3756. The \c{obj} file format (NASM calls it \c{obj} rather than \c{omf}
  3757. for historical reasons) is the one produced by \i{MASM} and
  3758. \i{TASM}, which is typically fed to 16-bit DOS linkers to produce
  3759. \i\c{.EXE} files. It is also the format used by \i{OS/2}.
  3760. \c{obj} provides a default output file-name extension of \c{.obj}.
  3761. \c{obj} is not exclusively a 16-bit format, though: NASM has full
  3762. support for the 32-bit extensions to the format. In particular,
  3763. 32-bit \c{obj} format files are used by \i{Borland's Win32
  3764. compilers}, instead of using Microsoft's newer \i\c{win32} object
  3765. file format.
  3766. The \c{obj} format does not define any special segment names: you
  3767. can call your segments anything you like. Typical names for segments
  3768. in \c{obj} format files are \c{CODE}, \c{DATA} and \c{BSS}.
  3769. If your source file contains code before specifying an explicit
  3770. \c{SEGMENT} directive, then NASM will invent its own segment called
  3771. \i\c{__NASMDEFSEG} for you.
  3772. When you define a segment in an \c{obj} file, NASM defines the
  3773. segment name as a symbol as well, so that you can access the segment
  3774. address of the segment. So, for example:
  3775. \c segment data
  3776. \c
  3777. \c dvar: dw 1234
  3778. \c
  3779. \c segment code
  3780. \c
  3781. \c function:
  3782. \c mov ax,data ; get segment address of data
  3783. \c mov ds,ax ; and move it into DS
  3784. \c inc word [dvar] ; now this reference will work
  3785. \c ret
  3786. The \c{obj} format also enables the use of the \i\c{SEG} and
  3787. \i\c{WRT} operators, so that you can write code which does things
  3788. like
  3789. \c extern foo
  3790. \c
  3791. \c mov ax,seg foo ; get preferred segment of foo
  3792. \c mov ds,ax
  3793. \c mov ax,data ; a different segment
  3794. \c mov es,ax
  3795. \c mov ax,[ds:foo] ; this accesses `foo'
  3796. \c mov [es:foo wrt data],bx ; so does this
  3797. \S{objseg} \c{obj} Extensions to the \c{SEGMENT}
  3798. Directive\I{SEGMENT, obj extensions to}
  3799. The \c{obj} output format extends the \c{SEGMENT} (or \c{SECTION})
  3800. directive to allow you to specify various properties of the segment
  3801. you are defining. This is done by appending extra qualifiers to the
  3802. end of the segment-definition line. For example,
  3803. \c segment code private align=16
  3804. defines the segment \c{code}, but also declares it to be a private
  3805. segment, and requires that the portion of it described in this code
  3806. module must be aligned on a 16-byte boundary.
  3807. The available qualifiers are:
  3808. \b \i\c{PRIVATE}, \i\c{PUBLIC}, \i\c{COMMON} and \i\c{STACK} specify
  3809. the combination characteristics of the segment. \c{PRIVATE} segments
  3810. do not get combined with any others by the linker; \c{PUBLIC} and
  3811. \c{STACK} segments get concatenated together at link time; and
  3812. \c{COMMON} segments all get overlaid on top of each other rather
  3813. than stuck end-to-end.
  3814. \b \i\c{ALIGN} is used, as shown above, to specify the alignment in
  3815. bytes to which the segment start address is forced. The alignment
  3816. value given may be any power of two from 1 to 4096; in reality, the
  3817. only values supported are 1, 2, 4, 16, 256 and 4096, so if 8 is
  3818. specified it will be rounded up to 16, and 32, 64 and 128 will all
  3819. be rounded up to 256, and so on. Note that alignment to 4096-byte
  3820. boundaries is a \i{PharLap} extension to the format and may not be
  3821. supported by all linkers.\I{section alignment, in OBJ}\I{segment
  3822. alignment, in OBJ}\I{alignment, in OBJ sections}
  3823. \b \i\c{CLASS} can be used to specify the segment class; this feature
  3824. indicates to the linker that segments of the same class should be
  3825. placed near each other in the output file. The class name can be any
  3826. word, e.g. \c{CLASS=CODE}.
  3827. \b \i\c{OVERLAY}, like \c{CLASS}, is specified with an arbitrary word
  3828. as an argument, and provides overlay information to an
  3829. overlay-capable linker.
  3830. \b Segments can be declared as \i\c{USE16} or \i\c{USE32}, which has
  3831. the effect of recording the choice in the object file and also
  3832. ensuring that NASM's default assembly mode when assembling in that
  3833. segment is 16-bit or 32-bit respectively.
  3834. \b When writing \i{OS/2} object files, you should declare 32-bit
  3835. segments as \i\c{FLAT}, which causes the default segment base for
  3836. anything in the segment to be the special group \c{FLAT}, and also
  3837. defines the group if it is not already defined.
  3838. \b The \c{obj} file format also allows segments to be declared as
  3839. having a pre-defined absolute segment address, although no linkers
  3840. are currently known to make sensible use of this feature;
  3841. nevertheless, NASM allows you to declare a segment such as
  3842. \c{SEGMENT SCREEN ABSOLUTE=0xB800} if you need to. The \i\c{ABSOLUTE}
  3843. and \c{ALIGN} keywords are mutually exclusive.
  3844. NASM's default segment attributes are \c{PUBLIC}, \c{ALIGN=1}, no
  3845. class, no overlay, and \c{USE16}.
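As a sketch of how these qualifiers combine in a small 16-bit program,
the following declares one segment of each common kind. The segment
and class names are conventional choices, not requirements of the
format:

\c         segment code public align=16 class=CODE use16
\c         ; program code goes here
\c
\c         segment data public align=4 class=DATA
\c         ; initialized data goes here
\c
\c         segment stack stack class=STACK
\c         resb    256             ; reserve 256 bytes of stack

Segments named \c{code} in other modules are concatenated with this
one because they are \c{PUBLIC}, and the \c{STACK} combination type
tells the linker which segment holds the stack.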
  3846. \S{group} \i\c{GROUP}: Defining Groups of Segments\I{segments, groups of}
  3847. The \c{obj} format also allows segments to be grouped, so that a
  3848. single segment register can be used to refer to all the segments in
  3849. a group. NASM therefore supplies the \c{GROUP} directive, whereby
  3850. you can code
  3851. \c segment data
  3852. \c
  3853. \c ; some data
  3854. \c
  3855. \c segment bss
  3856. \c
  3857. \c ; some uninitialized data
  3858. \c
  3859. \c group dgroup data bss
  3860. which will define a group called \c{dgroup} to contain the segments
  3861. \c{data} and \c{bss}. Like \c{SEGMENT}, \c{GROUP} causes the group
  3862. name to be defined as a symbol, so that you can refer to a variable
  3863. \c{var} in the \c{data} segment as \c{var wrt data} or as \c{var wrt
  3864. dgroup}, depending on which segment value is currently in your
  3865. segment register.
  3866. If you just refer to \c{var}, however, and \c{var} is declared in a
  3867. segment which is part of a group, then NASM will default to giving
  3868. you the offset of \c{var} from the beginning of the \e{group}, not
  3869. the \e{segment}. Therefore \c{SEG var}, also, will return the group
  3870. base rather than the segment base.
  3871. NASM will allow a segment to be part of more than one group, but
  3872. will generate a warning if you do this. Variables declared in a
  3873. segment which is part of more than one group will default to being
  3874. relative to the first group that was defined to contain the segment.
  3875. A group does not have to contain any segments; you can still make
  3876. \c{WRT} references to a group which does not contain the variable
  3877. you are referring to. OS/2, for example, defines the special group
  3878. \c{FLAT} with no segments in it.
  3879. \S{uppercase} \i\c{UPPERCASE}: Disabling Case Sensitivity in Output
  3880. Although NASM itself is \i{case sensitive}, some OMF linkers are
  3881. not; therefore it can be useful for NASM to output single-case
  3882. object files. The \c{UPPERCASE} format-specific directive causes all
  3883. segment, group and symbol names that are written to the object file
  3884. to be forced to upper case just before being written. Within a
  3885. source file, NASM is still case-sensitive; but the object file can
  3886. be written entirely in upper case if desired.
  3887. \c{UPPERCASE} is used alone on a line; it requires no parameters.
  3888. \S{import} \i\c{IMPORT}: Importing DLL Symbols\I{DLL symbols,
  3889. importing}\I{symbols, importing from DLLs}
  3890. The \c{IMPORT} format-specific directive defines a symbol to be
  3891. imported from a DLL, for use if you are writing a DLL's \i{import
  3892. library} in NASM. You still need to declare the symbol as \c{EXTERN}
  3893. as well as using the \c{IMPORT} directive.
  3894. The \c{IMPORT} directive takes two required parameters, separated by
  3895. white space, which are (respectively) the name of the symbol you
  3896. wish to import and the name of the library you wish to import it
  3897. from. For example:
  3898. \c import WSAStartup wsock32.dll
  3899. A third optional parameter gives the name by which the symbol is
  3900. known in the library you are importing it from, in case this is not
  3901. the same as the name you wish the symbol to be known by to your code
  3902. once you have imported it. For example:
  3903. \c import asyncsel wsock32.dll WSAAsyncSelect
  3904. \S{export} \i\c{EXPORT}: Exporting DLL Symbols\I{DLL symbols,
  3905. exporting}\I{symbols, exporting from DLLs}
  3906. The \c{EXPORT} format-specific directive defines a global symbol to
  3907. be exported as a DLL symbol, for use if you are writing a DLL in
  3908. NASM. You still need to declare the symbol as \c{GLOBAL} as well as
  3909. using the \c{EXPORT} directive.
  3910. \c{EXPORT} takes one required parameter, which is the name of the
  3911. symbol you wish to export, as it was defined in your source file. An
  3912. optional second parameter (separated by white space from the first)
  3913. gives the \e{external} name of the symbol: the name by which you
  3914. wish the symbol to be known to programs using the DLL. If this name
  3915. is the same as the internal name, you may leave the second parameter
  3916. off.
  3917. Further parameters can be given to define attributes of the exported
  3918. symbol. These parameters, like the second, are separated by white
  3919. space. If further parameters are given, the external name must also
  3920. be specified, even if it is the same as the internal name. The
  3921. available attributes are:
  3922. \b \c{resident} indicates that the exported name is to be kept
  3923. resident by the system loader. This is an optimisation for
  3924. frequently used symbols imported by name.
  3925. \b \c{nodata} indicates that the exported symbol is a function which
  3926. does not make use of any initialized data.
  3927. \b \c{parm=NNN}, where \c{NNN} is an integer, sets the number of
  3928. parameter words for the case in which the symbol is a call gate
  3929. between 32-bit and 16-bit segments.
  3930. \b An attribute which is just a number indicates that the symbol
  3931. should be exported with an identifying number (ordinal), and gives
  3932. the desired number.
  3933. For example:
  3934. \c export myfunc
  3935. \c export myfunc TheRealMoreFormalLookingFunctionName
  3936. \c export myfunc myfunc 1234 ; export by ordinal
  3937. \c export myfunc myfunc resident parm=23 nodata
  3938. \S{dotdotstart} \i\c{..start}: Defining the \i{Program Entry
  3939. Point}
  3940. \c{OMF} linkers require exactly one of the object files being linked to
  3941. define the program entry point, where execution will begin when the
  3942. program is run. If the object file that defines the entry point is
  3943. assembled using NASM, you specify the entry point by declaring the
  3944. special symbol \c{..start} at the point where you wish execution to
  3945. begin.
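A minimal sketch of an entry point in an \c{obj} module follows. The
segment names and the \c{value} label are hypothetical, and the program
is assumed to be a 16-bit DOS executable that exits through \c{INT 21h}
function \c{4Ch}:

\c         segment code
\c
\c ..start:                        ; execution begins here
\c         mov     ax,data
\c         mov     ds,ax           ; make DS address the data segment
\c         mov     ax,0x4C00
\c         int     0x21            ; exit to DOS
\c
\c         segment data
\c
\c value:  dw      0

Only one module in the link should define \c{..start}; the other
modules are reached from it by ordinary calls and jumps.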
  3946. \S{objextern} \c{obj} Extensions to the \c{EXTERN}
  3947. Directive\I{EXTERN, obj extensions to}
  3948. If you declare an external symbol with the directive
  3949. \c extern foo
  3950. then references such as \c{mov ax,foo} will give you the offset of
  3951. \c{foo} from its preferred segment base (as specified in whichever
  3952. module \c{foo} is actually defined in). So to access the contents of
  3953. \c{foo} you will usually need to do something like
  3954. \c mov ax,seg foo ; get preferred segment base
  3955. \c mov es,ax ; move it into ES
  3956. \c mov ax,[es:foo] ; and use offset `foo' from it
  3957. This is a little unwieldy, particularly if you know that an external
  3958. is going to be accessible from a given segment or group, say
  3959. \c{dgroup}. So if \c{DS} already contained \c{dgroup}, you could
  3960. simply code
  3961. \c mov ax,[foo wrt dgroup]
  3962. However, having to type this every time you want to access \c{foo}
  3963. can be a pain; so NASM allows you to declare \c{foo} in the
  3964. alternative form
  3965. \c extern foo:wrt dgroup
  3966. This form causes NASM to pretend that the preferred segment base of
  3967. \c{foo} is in fact \c{dgroup}; so the expression \c{seg foo} will
  3968. now return \c{dgroup}, and the expression \c{foo} is equivalent to
  3969. \c{foo wrt dgroup}.
  3970. This \I{default-WRT mechanism}default-\c{WRT} mechanism can be used
  3971. to make externals appear to be relative to any group or segment in
  3972. your program. It can also be applied to common variables: see
  3973. \k{objcommon}.
  3974. \S{objcommon} \c{obj} Extensions to the \c{COMMON}
  3975. Directive\I{COMMON, obj extensions to}
  3976. The \c{obj} format allows common variables to be either near\I{near
  3977. common variables} or far\I{far common variables}; NASM allows you to
  3978. specify which your variables should be by the use of the syntax
  3979. \c common nearvar 2:near ; `nearvar' is a near common
  3980. \c common farvar 10:far ; and `farvar' is far
  3981. Far common variables may be greater in size than 64KB, and so the
  3982. OMF specification says that they are declared as a number of
  3983. \e{elements} of a given size. So a 10-byte far common variable could
  3984. be declared as ten one-byte elements, five two-byte elements, two
  3985. five-byte elements or one ten-byte element.
  3986. Some \c{OMF} linkers require the \I{element size, in common
  3987. variables}\I{common variables, element size}element size, as well as
  3988. the variable size, to match when resolving common variables declared
  3989. in more than one module. Therefore NASM must allow you to specify
  3990. the element size on your far common variables. This is done by the
  3991. following syntax:
  3992. \c common c_5by2 10:far 5 ; two five-byte elements
  3993. \c common c_2by5 10:far 2 ; five two-byte elements
  3994. If no element size is specified, the default is 1. Also, the \c{FAR}
  3995. keyword is not required when an element size is specified, since
  3996. only far commons may have element sizes at all. So the above
  3997. declarations could equivalently be
  3998. \c common c_5by2 10:5 ; two five-byte elements
  3999. \c common c_2by5 10:2 ; five two-byte elements
  4000. In addition to these extensions, the \c{COMMON} directive in \c{obj}
  4001. also supports default-\c{WRT} specification like \c{EXTERN} does
  4002. (explained in \k{objextern}). So you can also declare things like
  4003. \c common foo 10:wrt dgroup
  4004. \c common bar 16:far 2:wrt data
  4005. \c common baz 24:wrt data:6
  4006. \S{objdepend} Embedded File Dependency Information
  4007. Since NASM 2.13.02, \c{obj} files contain embedded dependency file
  4008. information. To suppress the generation of dependencies, use
  4009. \c %pragma obj nodepend
  4010. \H{win32fmt} \i\c{win32}: Microsoft Win32 Object Files
  4011. The \c{win32} output format generates Microsoft Win32 object files,
  4012. suitable for passing to Microsoft linkers such as \i{Visual C++}.
  4013. Note that Borland Win32 compilers do not use this format, but use
  4014. \c{obj} instead (see \k{objfmt}).
  4015. \c{win32} provides a default output file-name extension of \c{.obj}.
  4016. Note that although Microsoft say that Win32 object files follow the
  4017. \c{COFF} (Common Object File Format) standard, the object files produced
  4018. by Microsoft Win32 compilers are not compatible with COFF linkers
  4019. such as DJGPP's, and vice versa. This is due to a difference of
  4020. opinion over the precise semantics of PC-relative relocations. To
  4021. produce COFF files suitable for DJGPP, use NASM's \c{coff} output
  4022. format; conversely, the \c{coff} format does not produce object
  4023. files that Win32 linkers can generate correct output from.
  4024. \S{win32sect} \c{win32} Extensions to the \c{SECTION}
  4025. Directive\I{SECTION, win32 extensions to}
  4026. Like the \c{obj} format, \c{win32} allows you to specify additional
  4027. information on the \c{SECTION} directive line, to control the type
  4028. and properties of sections you declare. Section types and properties
  4029. are generated automatically by NASM for the \i{standard section names}
  4030. \c{.text}, \c{.data} and \c{.bss}, but may still be overridden by
  4031. these qualifiers.
  4032. The available qualifiers are:
  4033. \b \c{code}, or equivalently \c{text}, defines the section to be a
  4034. code section. This marks the section as readable and executable, but
  4035. not writable, and also indicates to the linker that the type of the
  4036. section is code.
  4037. \b \c{data} and \c{bss} define the section to be a data section,
  4038. analogously to \c{code}. Data sections are marked as readable and
  4039. writable, but not executable. \c{data} declares an initialized data
  4040. section, whereas \c{bss} declares an uninitialized data section.
  4041. \b \c{rdata} declares an initialized data section that is readable
  4042. but not writable. Microsoft compilers place constants in this
  4043. section.
  4044. \b \c{info} defines the section to be an \i{informational section},
  4045. which is not included in the executable file by the linker, but may
  4046. (for example) pass information \e{to} the linker. For example,
  4047. declaring an \c{info}-type section called \i\c{.drectve} causes the
  4048. linker to interpret the contents of the section as command-line
  4049. options.
  4050. \b \c{align=}, used with a trailing number as in \c{obj}, gives the
  4051. \I{section alignment, in win32}\I{alignment, in win32
  4052. sections}alignment requirements of the section. The maximum you may
  4053. specify is 64: the Win32 object file format contains no means to
  4054. request a greater section alignment than this. If alignment is not
  4055. explicitly specified, the defaults are 16-byte alignment for code
  4056. sections, 8-byte alignment for rdata sections and 4-byte alignment
  4057. for data (and BSS) sections.
  4058. Informational sections get a default alignment of 1 byte (no
  4059. alignment), though the value does not matter.
  4060. The defaults assumed by NASM if you do not specify the above
  4061. qualifiers are:
  4062. \c section .text code align=16
  4063. \c section .data data align=4
  4064. \c section .rdata rdata align=8
  4065. \c section .bss bss align=4
  4066. Any other section name is treated by default like \c{.text}.
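As an illustration of overriding these defaults, the following sketch
declares three sections with hypothetical names, using only the
qualifiers described above:
\c section .hotcode code  align=32   ; executable, 32-byte aligned
\c section .consts  rdata align=16   ; read-only initialized data
\c section .scratch bss   align=64   ; uninitialized data, maximum alignment
Because any other section name is treated like \c{.text} by default,
it is probably wise to state the type explicitly for data sections.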
  4067. \S{win32safeseh} \c{win32}: Safe Structured Exception Handling
  4068. Among other improvements in Windows XP SP2 and Windows Server 2003,
  4069. Microsoft introduced the concept of "safe structured exception
  4070. handling." The general idea is to collect handlers' entry points in a
  4071. designated read-only table and to verify an alleged entry point
  4072. against this table before exception control is passed to the handler. For
  4073. an executable module to be equipped with such a "safe exception
  4074. handler table," all object modules on the linker command line have to comply
  4075. with certain criteria. If even a single module among them does not, then
  4076. the table is omitted and the above-mentioned run-time checks
  4077. will not be performed for the application. Table omission is by
  4078. default silent and can therefore be easily overlooked. One can instruct
  4079. the linker to refuse to produce a binary without such a table by passing
  4080. the \c{/safeseh} command line option.
  4081. Whatever the merits of this run-time check, it is natural to expect
  4082. NASM to be capable of generating modules suitable for \c{/safeseh}
  4083. linking. From the developer's viewpoint the problem is two-fold:
  4084. \b how to adapt modules that do not deploy exception handlers of their own;
  4085. \b how to adapt or develop modules that use custom exception handling.
  4086. The former can be easily achieved with any NASM version by adding the following
  4087. line to the source code:
  4088. \c $@feat.00 equ 1
  4089. As of version 2.03 NASM adds this absolute symbol automatically, or,
  4090. to be precise, it does so only if the symbol is not already present. So if, for whatever reason,
  4091. the developer chooses to assign another value in the source file, that is
  4092. still perfectly possible.
  4093. Registering a custom exception handler, on the other hand, requires certain
  4094. "magic." As of version 2.03 an additional directive is implemented,
  4095. \c{safeseh}, which instructs the assembler to produce appropriately
  4096. formatted input data for the above-mentioned "safe exception handler
  4097. table." Its typical use would be:
  4098. \c section .text
  4099. \c extern _MessageBoxA@16
  4100. \c %if __NASM_VERSION_ID__ >= 0x02030000
  4101. \c safeseh handler ; register handler as "safe handler"
  4102. \c %endif
  4103. \c handler:
  4104. \c push DWORD 1 ; MB_OKCANCEL
  4105. \c push DWORD caption
  4106. \c push DWORD text
  4107. \c push DWORD 0
  4108. \c call _MessageBoxA@16
  4109. \c sub eax,1 ; incidentally suits as return value
  4110. \c ; for exception handler
  4111. \c ret
  4112. \c global _main
  4113. \c _main:
  4114. \c push DWORD handler
  4115. \c push DWORD [fs:0]
  4116. \c mov DWORD [fs:0],esp ; engage exception handler
  4117. \c xor eax,eax
  4118. \c mov eax,DWORD[eax] ; cause exception
  4119. \c pop DWORD [fs:0] ; disengage exception handler
  4120. \c add esp,4
  4121. \c ret
  4122. \c text: db 'OK to rethrow, CANCEL to generate core dump',0
  4123. \c caption:db 'SEGV',0
  4124. \c
  4125. \c section .drectve info
  4126. \c db '/defaultlib:user32.lib /defaultlib:msvcrt.lib '
  4127. As you might imagine, it's perfectly possible to produce an .exe binary
  4128. with a "safe exception handler table" and yet engage an unregistered
  4129. exception handler. Indeed, a handler is engaged by simply manipulating
  4130. the \c{[fs:0]} location at run-time, which is something the linker has
  4131. no power over. It should be explicitly mentioned that failing
  4132. to register a handler's entry point with the \c{safeseh} directive has
  4133. an undesired side effect at run-time. If an exception is raised and an
  4134. unregistered handler is to be executed, the application is abruptly
  4135. terminated without any notification whatsoever. One can argue that the
  4136. system could at least have logged some kind of "non-safe exception
  4137. handler in x.exe at address n" message in the event log, but no, literally
  4138. no notification is provided and the user is left with no clue as to what
  4139. caused the application failure.
  4140. Finally, all mentions of the linker in this section refer to Microsoft
  4141. linker version 7.x and later. The presence of the \c{@feat.00} symbol and input
  4142. data for the "safe exception handler table" causes no backward
  4143. incompatibilities, and "safeseh" modules generated by NASM 2.03 and
  4144. later can still be linked by earlier versions or by non-Microsoft linkers.
  4145. \S{codeview} Debugging formats for Windows
  4146. \I{Windows debugging formats}
  4147. The \c{win32} and \c{win64} formats support the Microsoft CodeView
  4148. debugging format. Currently CodeView version 8 format is supported
  4149. (\i\c{cv8}), but newer versions of the CodeView debugger should be
  4150. able to handle this format as well.
  4151. \H{win64fmt} \i\c{win64}: Microsoft Win64 Object Files
  4152. The \c{win64} output format generates Microsoft Win64 object files.
  4153. The format is nearly identical to the \c{win32} object format (\k{win32fmt}),
  4154. except that it targets 64-bit code and the x86-64
  4155. platform. In NASM it is used in exactly the same way as the \c{win32}
  4156. object format (\k{win32fmt}), apart from this difference.
  4157. \S{win64pic} \c{win64}: Writing Position-Independent Code
  4158. While \c{REL} takes good care of RIP-relative addressing, there is one
  4159. aspect that is easy to overlook for a Win64 programmer: indirect
  4160. references. Consider a switch dispatch table:
  4161. \c jmp qword [dsptch+rax*8]
  4162. \c ...
  4163. \c dsptch: dq case0
  4164. \c dq case1
  4165. \c ...
  4166. Even a novice Win64 assembler programmer will soon realize that the code
  4167. is not 64-bit savvy. Most notably, the linker will refuse to link it, reporting
  4168. \c 'ADDR32' relocation to '.text' invalid without /LARGEADDRESSAWARE:NO
  4169. So [s]he will have to split the \c{jmp} instruction as follows:
  4170. \c lea rbx,[rel dsptch]
  4171. \c jmp qword [rbx+rax*8]
  4172. What happens behind the scenes is that the effective address in \c{lea} is
  4173. encoded relative to the instruction pointer, that is, in a perfectly
  4174. position-independent manner. But this is only part of the problem!
  4175. The trouble is that in a .dll context the \c{caseN} relocations will make their
  4176. way to the final module and might have to be adjusted at .dll load
  4177. time, specifically when the .dll can't be loaded at its preferred address. And
  4178. when this occurs, pages with such relocations will be rendered private
  4179. to the current process, which rather undermines the idea of sharing a .dll.
  4180. But not to worry, it's trivial to fix:
  4181. \c lea rbx,[rel dsptch]
  4182. \c add rbx,[rbx+rax*8]
  4183. \c jmp rbx
  4184. \c ...
  4185. \c dsptch: dq case0-dsptch
  4186. \c dq case1-dsptch
  4187. \c ...
  4188. NASM version 2.03 and later provides another alternative, the \c{wrt
  4189. ..imagebase} operator, which returns the offset from the base address of the
  4190. current image, be it an .exe or a .dll module, hence the name. For those
  4191. acquainted with the PE-COFF format, the base address denotes the start of the
  4192. \c{IMAGE_DOS_HEADER} structure. Here is how to implement a switch with
  4193. these image-relative references:
  4194. \c lea rbx,[rel dsptch]
  4195. \c mov eax,[rbx+rax*4]
  4196. \c sub rbx,dsptch wrt ..imagebase
  4197. \c add rbx,rax
  4198. \c jmp rbx
  4199. \c ...
  4200. \c dsptch: dd case0 wrt ..imagebase
  4201. \c dd case1 wrt ..imagebase
  4202. One can argue that the operator is redundant. Indeed, the snippet before
  4203. last works just fine with any NASM version and is not even Windows
  4204. specific. The real reason for implementing \c{wrt ..imagebase} will
  4205. become apparent in the next section.
  4206. It should be noted that \c{wrt ..imagebase} is defined as a 32-bit
  4207. operand only:
  4208. \c dd label wrt ..imagebase ; ok
  4209. \c dq label wrt ..imagebase ; bad
  4210. \c mov eax,label wrt ..imagebase ; ok
  4211. \c mov rax,label wrt ..imagebase ; bad
  4212. \S{win64seh} \c{win64}: Structured Exception Handling
  4213. Structured exception handling in Win64 is a completely different matter
  4214. from Win32. Upon an exception the program counter value is noted, and a
  4215. linker-generated table comprising the start and end addresses of all the
  4216. functions [in the given executable module] is traversed and compared to the
  4217. saved program counter. Thus the so-called \c{UNWIND_INFO} structure is
  4218. identified. If it's not found, then the offending subroutine is assumed to
  4219. be a "leaf" and the lookup procedure just mentioned is attempted for its
  4220. caller. In Win64 a leaf function is one that does not call any
  4221. other function \e{nor} modifies any Win64 non-volatile registers,
  4222. including the stack pointer. The latter ensures that it's possible to
  4223. identify a leaf function's caller by simply pulling the value from the
  4224. top of the stack.
  4225. While the majority of subroutines written in assembler do not call any
  4226. other function, the requirement that non-volatile registers stay unmodified
  4227. leaves the developer with no more than 7 registers and no stack frame,
  4228. which is not necessarily what [s]he counted on. Customarily one would
  4229. meet the requirement by saving non-volatile registers on the stack and
  4230. restoring them upon return, so what can go wrong? If [and only if] an
  4231. exception is raised at run-time and no \c{UNWIND_INFO} structure is
  4232. associated with such a "leaf" function, the stack unwind procedure will
  4233. expect to find the caller's return address on the top of the stack, immediately
  4234. followed by its frame. Given that the developer pushed the caller's
  4235. non-volatile registers on the stack, would the value on top point at some
  4236. code segment or even addressable space? Well, the developer can attempt
  4237. copying the caller's return address to the top of the stack, and this would
  4238. actually work in some very specific circumstances. But unless the developer
  4239. can guarantee that these circumstances are always met, it's more
  4240. appropriate to assume the worst-case scenario, i.e. the stack unwind procedure
  4241. going berserk. The relevant question is: what happens then? The application is
  4242. abruptly terminated without any notification whatsoever. Just as in the
  4243. Win32 case, one can argue that the system could at least have logged
  4244. "unwind procedure went berserk in x.exe at address n" in the event log, but
  4245. no, no trace of the failure is left.
  4246. Now that we understand the significance of the \c{UNWIND_INFO} structure,
  4247. let's discuss what's in it and how it's processed. First of all, it
  4248. is checked for the presence of a reference to a custom language-specific
  4249. exception handler. If there is one, then it's invoked. Depending on the
  4250. return value, either execution flow is resumed (the exception is said to be
  4251. "handled"), \e{or} the rest of the \c{UNWIND_INFO} structure is processed as
  4252. follows. Besides the optional reference to a custom handler, it carries
  4253. information about the current callee's stack frame and where non-volatile
  4254. registers are saved. The information is detailed enough to
  4255. reconstruct the contents of the caller's non-volatile registers as they were upon the call to the
  4256. current callee. And so the caller's context is reconstructed, and then the
  4257. unwind procedure is repeated, i.e. another \c{UNWIND_INFO} structure is
  4258. associated, this time with the caller's instruction pointer, and is then
  4259. checked for the presence of a reference to a language-specific handler, etc.
  4260. The procedure is repeated recursively until the exception is handled. As a
  4261. last resort the system "handles" it by generating a memory core dump and
  4262. terminating the application.
  4263. At the time of this writing NASM unfortunately does not
  4264. facilitate generation of the above-mentioned detailed information about
  4265. stack frame layout. But as of version 2.03 it implements the building
  4266. blocks for generating the structures involved in stack unwinding. As the
  4267. simplest example, here is how to deploy a custom exception handler for a
  4268. leaf function:
  4269. \c default rel
  4270. \c section .text
  4271. \c extern MessageBoxA
  4272. \c handler:
  4273. \c sub rsp,40
  4274. \c mov rcx,0
  4275. \c lea rdx,[text]
  4276. \c lea r8,[caption]
  4277. \c mov r9,1 ; MB_OKCANCEL
  4278. \c call MessageBoxA
  4279. \c sub eax,1 ; incidentally suits as return value
  4280. \c ; for exception handler
  4281. \c add rsp,40
  4282. \c ret
  4283. \c global main
  4284. \c main:
  4285. \c xor rax,rax
  4286. \c mov rax,QWORD[rax] ; cause exception
  4287. \c ret
  4288. \c main_end:
  4289. \c text: db 'OK to rethrow, CANCEL to generate core dump',0
  4290. \c caption:db 'SEGV',0
  4291. \c
  4292. \c section .pdata rdata align=4
  4293. \c dd main wrt ..imagebase
  4294. \c dd main_end wrt ..imagebase
  4295. \c dd xmain wrt ..imagebase
  4296. \c section .xdata rdata align=8
  4297. \c xmain: db 9,0,0,0
  4298. \c dd handler wrt ..imagebase
  4299. \c section .drectve info
  4300. \c db '/defaultlib:user32.lib /defaultlib:msvcrt.lib '
  4301. What you see in the \c{.pdata} section is an element of the "table comprising
  4302. start and end addresses of functions" along with a reference to the associated
  4303. \c{UNWIND_INFO} structure. And what you see in the \c{.xdata} section is an
  4304. \c{UNWIND_INFO} structure describing a function with no frame, but with a
  4305. designated exception handler. References are \e{required} to be
  4306. image-relative (which is the real reason for implementing the \c{wrt
  4307. ..imagebase} operator). It should be noted that \c{rdata align=n}, as
  4308. well as \c{wrt ..imagebase}, are optional in the context of these two
  4309. segments, i.e. can be omitted. The latter means that \e{all} 32-bit
  4310. references placed into these two segments, not only the required ones listed above,
  4311. turn out image-relative. Why is this important to understand?
  4312. The developer is allowed to append handler-specific data to the \c{UNWIND_INFO}
  4313. structure, and if [s]he adds a 32-bit reference, then [s]he will have
  4314. to remember to adjust its value to obtain the real pointer.
  4315. As already mentioned, in Win64 terms a leaf function is one that does not
  4316. call any other function \e{nor} modifies any non-volatile register,
  4317. including the stack pointer. But it's not uncommon for an assembler
  4318. programmer to plan to use every single register and sometimes even to
  4319. have a variable stack frame. Is there anything one can do with the bare
  4320. building blocks, i.e. besides manually composing a fully-fledged
  4321. \c{UNWIND_INFO} structure, which would surely be considered
  4322. error-prone? Yes, there is. Recall that the exception handler is called
  4323. first, before the stack layout is analyzed. As it turns out, it's
  4324. perfectly possible to manipulate the current callee's context in a custom
  4325. handler in a manner that permits further stack unwinding. The general idea is
  4326. that the handler would not actually "handle" the exception, but instead
  4327. restore the callee's context as it was at its entry point and thus mimic a
  4328. leaf function. In other words, the handler would simply undertake part of the
  4329. unwinding procedure. Consider the following example:
  4330. \c function:
  4331. \c mov rax,rsp ; copy rsp to volatile register
  4332. \c push r15 ; save non-volatile registers
  4333. \c push rbx
  4334. \c push rbp
  4335. \c mov r11,rsp ; prepare variable stack frame
  4336. \c sub r11,rcx
  4337. \c and r11,-64
  4338. \c mov QWORD[r11],rax ; check for exceptions
  4339. \c mov rsp,r11 ; allocate stack frame
  4340. \c mov QWORD[rsp],rax ; save original rsp value
  4341. \c magic_point:
  4342. \c ...
  4343. \c mov r11,QWORD[rsp] ; pull original rsp value
  4344. \c mov rbp,QWORD[r11-24]
  4345. \c mov rbx,QWORD[r11-16]
  4346. \c mov r15,QWORD[r11-8]
  4347. \c mov rsp,r11 ; destroy frame
  4348. \c ret
  4349. The key point is that up to \c{magic_point} the original \c{rsp} value
  4350. remains in the chosen volatile register and no non-volatile register,
  4351. except for \c{rsp}, is modified, while past \c{magic_point} \c{rsp}
  4352. remains constant until the very end of the \c{function}. In this case the
  4353. custom language-specific exception handler would look like this:
  4354. \c EXCEPTION_DISPOSITION handler (EXCEPTION_RECORD *rec,ULONG64 frame,
  4355. \c CONTEXT *context,DISPATCHER_CONTEXT *disp)
  4356. \c { ULONG64 *rsp;
  4357. \c if (context->Rip<(ULONG64)magic_point)
  4358. \c rsp = (ULONG64 *)context->Rax;
  4359. \c else
  4360. \c { rsp = ((ULONG64 **)context->Rsp)[0];
  4361. \c context->Rbp = rsp[-3];
  4362. \c context->Rbx = rsp[-2];
  4363. \c context->R15 = rsp[-1];
  4364. \c }
  4365. \c context->Rsp = (ULONG64)rsp;
  4366. \c
  4367. \c memcpy (disp->ContextRecord,context,sizeof(CONTEXT));
  4368. \c RtlVirtualUnwind(UNW_FLAG_NHANDLER,disp->ImageBase,
  4369. \c disp->ControlPc,disp->FunctionEntry,disp->ContextRecord,
  4370. \c &disp->HandlerData,&disp->EstablisherFrame,NULL);
  4371. \c return ExceptionContinueSearch;
  4372. \c }
  4373. As the custom handler mimics a leaf function, the corresponding \c{UNWIND_INFO}
  4374. structure does not have to contain any information about the stack frame
  4375. and its layout.
  4376. \H{cofffmt} \i\c{coff}: \i{Common Object File Format}
  4377. The \c{coff} output type produces \c{COFF} object files suitable for
  4378. linking with the \i{DJGPP} linker.
  4379. \c{coff} provides a default output file-name extension of \c{.o}.
  4380. The \c{coff} format supports the same extensions to the \c{SECTION}
  4381. directive as \c{win32} does, except that the \c{align} qualifier and
  4382. the \c{info} section type are not supported.
  4383. \H{machofmt} \I{Mach-O}\i\c{macho32} and \i\c{macho64}: \i{Mach Object File Format}
  4384. The \c{macho32} and \c{macho64} output formats produce Mach-O
  4385. object files suitable for linking with the \i{MacOS X} linker.
  4386. \i\c{macho} is a synonym for \c{macho32}.
  4387. \c{macho} provides a default output file-name extension of \c{.o}.
  4388. \S{machosect} \c{macho} extensions to the \c{SECTION} Directive
  4389. \I{SECTION, macho extensions to}
  4390. The \c{macho} output format specifies section names in the format
  4391. "\e{segment}\c{,}\e{section}". No spaces are allowed around the
  4392. comma. The following flags can also be specified:
  4393. \b \c{data} - this section contains initialized data items
  4394. \b \c{code} - this section contains code exclusively
  4395. \b \c{mixed} - this section contains both code and data
  4396. \b \c{bss} - this section is uninitialized and filled with zero
  4397. \b \c{zerofill} - same as \c{bss}
  4398. \b \c{no_dead_strip} - inhibit dead code stripping for this section
  4399. \b \c{live_support} - set the live support flag for this section
  4400. \b \c{strip_static_syms} - strip static symbols for this section
  4401. \b \c{debug} - this section contains debugging information
  4402. \b \c{align=}\e{alignment} - specify section alignment
  4403. The default is \c{data}, unless the section name is \c{__text} or
  4404. \c{__bss} in which case the default is \c{text} or \c{bss},
  4405. respectively.
  4406. For compatibility with other Unix platforms, the following standard
  4407. names are also supported:
  4408. \c .text = __TEXT,__text text
  4409. \c .rodata = __DATA,__const data
  4410. \c .data = __DATA,__data data
  4411. \c .bss = __DATA,__bss bss
  4412. If the \c{.rodata} section contains no relocations, it is instead put
  4413. into the \c{__TEXT,__const} section unless this section has already
  4414. been specified explicitly. However, it is probably better to specify
  4415. \c{__TEXT,__const} and \c{__DATA,__const} explicitly as appropriate.
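A short sketch of explicit Mach-O section declarations, using only the
flags listed above and assuming that several flags may be combined on
one line; the section name \c{__mytable} is hypothetical:
\c section __TEXT,__text    code align=16
\c section __TEXT,__const   data align=8
\c section __DATA,__mytable data no_dead_strip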
  4416. \S{machotls} \i{Thread Local Storage in Mach-O}\I{TLS}: \c{macho} special
  4417. symbols and \i\c{WRT}
  4418. Mach-O defines the following special symbols that can be used on the
  4419. right-hand side of the \c{WRT} operator:
  4420. \b \c{..tlvp} is used to specify access to thread-local storage.
  4421. \b \c{..gotpcrel} is used to specify references to the Global Offset
  4422. Table. The GOT is supported in the \c{macho64} format only.
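As a minimal sketch of the second item, assuming a \c{macho64} build and
a hypothetical external symbol \c{_foo} defined in another module, a
load through the GOT might look like this:
\c         default rel
\c         extern  _foo
\c         section .text
\c         mov     rax,[rel _foo wrt ..gotpcrel]   ; fetch the address of _foo from its GOT entry
\c         mov     eax,[rax]                       ; read the first dword stored at _foo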
  4423. \S{macho-ssvs} \c{macho} specific directive \i\c{subsections_via_symbols}
  4424. The directive \c{subsections_via_symbols} sets the
  4425. \c{MH_SUBSECTIONS_VIA_SYMBOLS} flag in the Mach-O header, which effectively
  4426. separates each section into blocks (subsections) delimited by symbols. It is often used
  4427. to let the linker eliminate dead code.
  4428. This directive takes no arguments.
  4429. This is a macro implemented as a \c{%pragma}. It can also be
  4430. specified in its \c{%pragma} form, in which case it will not affect
  4431. non-Mach-O builds of the same source code:
  4432. \c %pragma macho subsections_via_symbols
  4433. \S{macho-ssvs} \c{macho} specific directive \i\c{no_dead_strip}
  4434. The directive \c{no_dead_strip} sets the Mach-O \c{SH_NO_DEAD_STRIP}
  4435. section flag on the section containing a specific symbol. This
  4436. directive takes a list of symbols as its arguments.
  4437. This is a macro implemented as a \c{%pragma}. It can also be
  4438. specified in its \c{%pragma} form, in which case it will not affect
  4439. non-Mach-O builds of the same source code:
  4440. \c %pragma macho no_dead_strip symbol...
  4441. \S{macho-pext} \c{macho} specific extensions to the \c{GLOBAL}
  4442. Directive: \i\c{private_extern}
  4443. This extension to the \c{GLOBAL} directive marks the symbol as having limited
  4444. global scope. For example, you can declare a global symbol with
  4445. this extension:
  4446. \c global foo:private_extern
  4447. \c foo:
  4448. \c ; code
  4449. Linking with the static linker clears the private extern attribute,
  4450. but a linker option such as \c{-keep_private_externs} can prevent this.
  4451. \H{elffmt} \i\c{elf32}, \i\c{elf64}, \i\c{elfx32}: \I{ELF}\I{linux, elf}\i{Executable and Linkable
  4452. Format} Object Files
  4453. The \c{elf32}, \c{elf64} and \c{elfx32} output formats generate
  4454. \c{ELF32} and \c{ELF64} (Executable and Linkable Format) object files, as
  4455. used by Linux as well as \i{Unix System V}, including \i{Solaris x86},
  4456. \i{UnixWare} and \i{SCO Unix}. \c{elf} provides a default output
  4457. file-name extension of \c{.o}. \c{elf} is a synonym for \c{elf32}.
  4458. The \c{elfx32} format is used for the \i{x32} ABI, which is a 32-bit
  4459. ABI with the CPU in 64-bit mode.
  4460. \S{abisect} ELF specific directive \i\c{osabi}
  4461. The ELF header specifies the application binary interface for the
  4462. target operating system (OSABI). This field can be set by using the
  4463. \c{osabi} directive with the numeric value (0-255) of the target
  4464. system. If this directive is not used, the default value will be "UNIX
  4465. System V ABI" (0) which will work on most systems which support ELF.
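For example, assuming the conventional ELF identification values (3 for
Linux/GNU, 9 for FreeBSD), a module intended for FreeBSD could declare:
\c osabi 9          ; ELFOSABI_FREEBSD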
  4466. \S{elfsect} \c{elf} extensions to the \c{SECTION} Directive
  4467. \I{SECTION, elf extensions to}
  4468. Like the \c{obj} format, \c{elf} allows you to specify additional
  4469. information on the \c{SECTION} directive line, to control the type
  4470. and properties of sections you declare. Section types and properties
  4471. are generated automatically by NASM for the \i{standard section
  4472. names}, but may still be
  4473. overridden by these qualifiers.
  4474. The available qualifiers are:
  4475. \b \i\c{alloc} defines the section to be one which is loaded into
  4476. memory when the program is run. \i\c{noalloc} defines it to be one
  4477. which is not, such as an informational or comment section.
  4478. \b \i\c{exec} defines the section to be one which should have execute
  4479. permission when the program is run. \i\c{noexec} defines it as one
  4480. which should not.
  4481. \b \i\c{write} defines the section to be one which should be writable
  4482. when the program is run. \i\c{nowrite} defines it as one which should
  4483. not.
  4484. \b \i\c{progbits} defines the section to be one with explicit contents
  4485. stored in the object file: an ordinary code or data section, for
  4486. example. \i\c{nobits} defines the section to be one with no explicit
  4487. contents given, such as a BSS section.
  4488. \b \c{align=}, used with a trailing number as in \c{obj}, gives the
  4489. \I{section alignment, in elf}\I{alignment, in elf sections}alignment
  4490. requirements of the section.
  4491. \b \i\c{tls} defines the section to be one which contains
  4492. thread local variables.
  4493. The defaults assumed by NASM if you do not specify the above
  4494. qualifiers are:
  4495. \I\c{.text} \I\c{.rodata} \I\c{.lrodata} \I\c{.data} \I\c{.ldata}
  4496. \I\c{.bss} \I\c{.lbss} \I\c{.tdata} \I\c{.tbss} \I\c{.comment}
  4497. \c section .text progbits alloc exec nowrite align=16
  4498. \c section .rodata progbits alloc noexec nowrite align=4
  4499. \c section .lrodata progbits alloc noexec nowrite align=4
  4500. \c section .data progbits alloc noexec write align=4
  4501. \c section .ldata progbits alloc noexec write align=4
  4502. \c section .bss nobits alloc noexec write align=4
  4503. \c section .lbss nobits alloc noexec write align=4
  4504. \c section .tdata progbits alloc noexec write align=4 tls
  4505. \c section .tbss nobits alloc noexec write align=4 tls
  4506. \c section .comment progbits noalloc noexec nowrite align=1
  4507. \c section other progbits alloc noexec nowrite align=1
  4508. (Any section name other than those in the above table
  4509. is treated by default like \c{other} in the above table.
  4510. Please note that section names are case sensitive.)
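As a sketch of how the qualifiers combine, the following declares three
sections with hypothetical names, overriding the \c{other} defaults:
\c section .mytable progbits alloc   noexec write   align=16 ; writable, initialized
\c section .scratch nobits   alloc   noexec write   align=64 ; BSS-like, no file contents
\c section .myinfo  progbits noalloc noexec nowrite align=1  ; not loaded at run time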
  4511. \S{elfwrt} \i{Position-Independent Code}\I{PIC}: \c{elf} Special
  4512. Symbols and \i\c{WRT}
  4513. Since \c{ELF} does not support segment-base references, the \c{WRT}
  4514. operator is not used for its normal purpose; therefore NASM's
  4515. \c{elf} output format makes use of \c{WRT} for a different purpose,
  4516. namely the PIC-specific \I{relocations, PIC-specific}relocation
  4517. types.
  4518. \c{elf} defines five special symbols which you can use as the
  4519. right-hand side of the \c{WRT} operator to obtain PIC relocation
  4520. types. They are \i\c{..gotpc}, \i\c{..gotoff}, \i\c{..got},
  4521. \i\c{..plt} and \i\c{..sym}. Their functions are summarized here:
  4522. \b Referring to the symbol marking the global offset table base
  4523. using \c{wrt ..gotpc} will end up giving the distance from the
  4524. beginning of the current section to the global offset table.
  4525. (\i\c{_GLOBAL_OFFSET_TABLE_} is the standard symbol name used to
  4526. refer to the \i{GOT}.) So you would then need to add \i\c{$$} to the
  4527. result to get the real address of the GOT.
  4528. \b Referring to a location in one of your own sections using \c{wrt
  4529. ..gotoff} will give the distance from the beginning of the GOT to
  4530. the specified location, so that adding on the address of the GOT
  4531. would give the real address of the location you wanted.
  4532. \b Referring to an external or global symbol using \c{wrt ..got}
  4533. causes the linker to build an entry \e{in} the GOT containing the
  4534. address of the symbol, and the reference gives the distance from the
  4535. beginning of the GOT to the entry; so you can add on the address of
  4536. the GOT, load from the resulting address, and end up with the
  4537. address of the symbol.
  4538. \b Referring to a procedure name using \c{wrt ..plt} causes the
  4539. linker to build a \i{procedure linkage table} entry for the symbol,
  4540. and the reference gives the address of the \i{PLT} entry. You can
  4541. only use this in contexts which would generate a PC-relative
  4542. relocation normally (i.e. as the destination for \c{CALL} or
  4543. \c{JMP}), since ELF contains no relocation type to refer to PLT
  4544. entries absolutely.
  4545. \b Referring to a symbol name using \c{wrt ..sym} causes NASM to
  4546. write an ordinary relocation, but instead of making the relocation
  4547. relative to the start of the section and then adding on the offset
  4548. to the symbol, it will write a relocation record aimed directly at
  4549. the symbol in question. The distinction is a necessary one due to a
  4550. peculiarity of the dynamic linker.
  4551. A fuller explanation of how to use these relocation types to write
  4552. shared libraries entirely in NASM is given in \k{picdll}.
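
As a brief sketch of how these symbols combine in 32-bit code, the
fragment below obtains the GOT address in \c{EBX} and then uses it with
\c{..gotoff}, \c{..got} and \c{..plt}. The names \c{func},
\c{localvar}, \c{extvar} and \c{helper} are hypothetical, and the
fragment is only an outline of the technique, not a complete library.

\c extern  _GLOBAL_OFFSET_TABLE_
\c extern  extvar, helper
\c
\c section .text
\c
\c func:   push    ebx
\c         call    .get_GOT
\c .get_GOT:
\c         pop     ebx                     ; EBX = address of .get_GOT
\c         add     ebx,_GLOBAL_OFFSET_TABLE_+$$-.get_GOT wrt ..gotpc
\c                                         ; EBX = address of the GOT
\c
\c         lea     eax,[ebx+localvar wrt ..gotoff] ; address of our own data
\c         mov     eax,[ebx+extvar wrt ..got]      ; address of extvar, via its GOT entry
\c         call    helper wrt ..plt                ; call through the PLT
\c
\c         pop     ebx
\c         ret
\c
\c section .data
\c
\c localvar: dd 0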
  4553. \S{elftls} \i{Thread Local Storage in ELF}\I{TLS}: \c{elf} Special
  4554. Symbols and \i\c{WRT}
  4555. \b In ELF32 mode, referring to an external or global symbol using
  4556. \c{wrt ..tlsie} \I\c{..tlsie}
  4557. causes the linker to build an entry \e{in} the GOT containing the
  4558. offset of the symbol within the TLS block, so you can access the value
  4559. of the symbol with code such as:
  4560. \c mov eax,[tid wrt ..tlsie]
  4561. \c mov [gs:eax],ebx
  4562. \b In ELF64 or ELFx32 mode, referring to an external or global symbol using
  4563. \c{wrt ..gottpoff} \I\c{..gottpoff}
  4564. causes the linker to build an entry \e{in} the GOT containing the
  4565. offset of the symbol within the TLS block, so you can access the value
  4566. of the symbol with code such as:
  4567. \c mov rax,[rel tid wrt ..gottpoff]
  4568. \c mov rcx,[fs:rax]
  4569. \S{elfglob} \c{elf} Extensions to the \c{GLOBAL} Directive\I{GLOBAL,
  4570. elf extensions to}\I{GLOBAL, aoutb extensions to}
  4571. \c{ELF} object files can contain more information about a global symbol
  4572. than just its address: they can contain the \I{symbol sizes,
  4573. specifying}\I{size, of symbols}size of the symbol and its \I{symbol
  4574. types, specifying}\I{type, of symbols}type as well. These are not
  4575. merely debugger conveniences, but are actually necessary when the
  4576. program being written is a \i{shared library}. NASM therefore
  4577. supports some extensions to the \c{GLOBAL} directive, allowing you
  4578. to specify these features.
  4579. You can specify whether a global variable is a function or a data
  4580. object by suffixing the name with a colon and the word
  4581. \i\c{function} or \i\c{data}. (\i\c{object} is a synonym for
  4582. \c{data}.) For example:
  4583. \c global hashlookup:function, hashtable:data
  4584. exports the global symbol \c{hashlookup} as a function and
  4585. \c{hashtable} as a data object.
  4586. Optionally, you can control the ELF visibility of the symbol. Just
  4587. add one of the visibility keywords: \i\c{default}, \i\c{internal},
  4588. \i\c{hidden}, or \i\c{protected}. The default is \i\c{default} of
  4589. course. For example, to make \c{hashlookup} hidden:
  4590. \c global hashlookup:function hidden
  4591. You can also specify the size of the data associated with the
  4592. symbol, as a numeric expression (which may involve labels, and even
  4593. forward references) after the type specifier. Like this:
  4594. \c global hashtable:data (hashtable.end - hashtable)
  4595. \c
  4596. \c hashtable:
  4597. \c db this,that,theother ; some data here
  4598. \c .end:
  4599. This makes NASM automatically calculate the length of the table and
  4600. place that information into the \c{ELF} symbol table.
  4601. Declaring the type and size of global symbols is necessary when
  4602. writing shared library code. For more information, see
  4603. \k{picglobal}.
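
The same size syntax can be applied to a function. A minimal sketch,
reusing the \c{hashlookup} name from the examples above and assuming a
local label \c{.end} placed after the last instruction of the routine:

\c global hashlookup:function (hashlookup.end - hashlookup)
\c
\c hashlookup:
\c         ; function body here
\c         ret
\c .end: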
  4604. \S{elfcomm} \c{elf} Extensions to the \c{COMMON} Directive
  4605. \I{COMMON, elf extensions to}
  4606. \c{ELF} also allows you to specify alignment requirements \I{common
  4607. variables, alignment in elf}\I{alignment, of elf common variables}on
  4608. common variables. This is done by putting a number (which must be a
  4609. power of two) after the name and size of the common variable,
  4610. separated (as usual) by a colon. For example, an array of
  4611. doublewords would benefit from 4-byte alignment:
  4612. \c common dwordarray 128:4
  4613. This declares the total size of the array to be 128 bytes, and
  4614. requires that it be aligned on a 4-byte boundary.
  4615. \S{elf16} 16-bit code and ELF
  4616. \I{ELF, 16-bit code and}
  4617. The \c{ELF32} specification doesn't provide relocations for 8- and
  4618. 16-bit values, but the GNU \c{ld} linker adds these as an extension.
  4619. NASM can generate GNU-compatible relocations, to allow 16-bit code to
  4620. be linked as ELF using GNU \c{ld}. If NASM is used with the
  4621. \c{-w+gnu-elf-extensions} option, a warning is issued when one of
  4622. these relocations is generated.
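
For example, a 16-bit module that refers to a data item by address
needs a 16-bit relocation, which is one of the GNU extensions mentioned
above. This is a sketch only; the file name \c{boot16.asm} and the
labels \c{entry16} and \c{counter} are hypothetical.

\c bits 16
\c
\c section .text
\c
\c entry16:
\c         mov     ax,[counter]    ; 16-bit address: needs a 16-bit relocation
\c         inc     ax
\c         mov     [counter],ax
\c         ret
\c
\c section .data
\c
\c counter: dw 0

Assembling it with the warning enabled should report each such
relocation:

\c nasm -f elf32 -w+gnu-elf-extensions boot16.asm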
  4623. \S{elfdbg} Debug formats and ELF
  4624. \I{ELF, Debug formats and}
  4625. ELF provides debug information in \c{STABS} and \c{DWARF} formats.
  4626. Line number information is generated for all executable sections, but please
note that only the \c{.text} section is executable by default.
  4628. \H{aoutfmt} \i\c{aout}: Linux \I{a.out, Linux version}\I{linux, a.out}\c{a.out} Object Files
  4629. The \c{aout} format generates \c{a.out} object files, in the form used
  4630. by early Linux systems (current Linux systems use ELF, see
  4631. \k{elffmt}.) These differ from other \c{a.out} object files in that
  4632. the magic number in the first four bytes of the file is
  4633. different; also, some implementations of \c{a.out}, for example
  4634. NetBSD's, support position-independent code, which Linux's
  4635. implementation does not.
  4636. \c{a.out} provides a default output file-name extension of \c{.o}.
  4637. \c{a.out} is a very simple object format. It supports no special
  4638. directives, no special symbols, no use of \c{SEG} or \c{WRT}, and no
  4639. extensions to any standard directives. It supports only the three
  4640. \i{standard section names} \i\c{.text}, \i\c{.data} and \i\c{.bss}.
\H{aoutbfmt} \i\c{aoutb}: \i{NetBSD}/\i{FreeBSD}/\i{OpenBSD}
  4642. \I{a.out, BSD version}\c{a.out} Object Files
  4643. The \c{aoutb} format generates \c{a.out} object files, in the form
  4644. used by the various free \c{BSD Unix} clones, \c{NetBSD}, \c{FreeBSD}
  4645. and \c{OpenBSD}. For simple object files, this object format is exactly
  4646. the same as \c{aout} except for the magic number in the first four bytes
  4647. of the file. However, the \c{aoutb} format supports
  4648. \I{PIC}\i{position-independent code} in the same way as the \c{elf}
  4649. format, so you can use it to write \c{BSD} \i{shared libraries}.
  4650. \c{aoutb} provides a default output file-name extension of \c{.o}.
  4651. \c{aoutb} supports no special directives, no special symbols, and
  4652. only the three \i{standard section names} \i\c{.text}, \i\c{.data}
  4653. and \i\c{.bss}. However, it also supports the same use of \i\c{WRT} as
  4654. \c{elf} does, to provide position-independent code relocation types.
  4655. See \k{elfwrt} for full documentation of this feature.
  4656. \c{aoutb} also supports the same extensions to the \c{GLOBAL}
  4657. directive as \c{elf} does: see \k{elfglob} for documentation of
  4658. this.
  4659. \H{as86fmt} \c{as86}: \i{Minix}/Linux\I{linux, as86} \i\c{as86} Object Files
  4660. The Minix/Linux 16-bit assembler \c{as86} has its own non-standard
  4661. object file format. Although its companion linker \i\c{ld86} produces
  4662. something close to ordinary \c{a.out} binaries as output, the object
  4663. file format used to communicate between \c{as86} and \c{ld86} is not
  4664. itself \c{a.out}.
  4665. NASM supports this format, just in case it is useful, as \c{as86}.
  4666. \c{as86} provides a default output file-name extension of \c{.o}.
  4667. \c{as86} is a very simple object format (from the NASM user's point
  4668. of view). It supports no special directives, no use of \c{SEG} or \c{WRT},
  4669. and no extensions to any standard directives. It supports only the three
  4670. \i{standard section names} \i\c{.text}, \i\c{.data} and \i\c{.bss}. The
  4671. only special symbol supported is \c{..start}.
  4672. \H{rdffmt} \I{RDOFF}\i\c{rdf}: \i{Relocatable Dynamic Object File
  4673. Format}
  4674. The \c{rdf} output format produces \c{RDOFF} object files. \c{RDOFF}
  4675. (Relocatable Dynamic Object File Format) is a home-grown object-file
  4676. format, designed alongside NASM itself and reflecting in its file
  4677. format the internal structure of the assembler.
  4678. \c{RDOFF} is not used by any well-known operating systems. Those
  4679. writing their own systems, however, may well wish to use \c{RDOFF}
  4680. as their object format, on the grounds that it is designed primarily
  4681. for simplicity and contains very little file-header bureaucracy.
  4682. The Unix NASM archive, and the DOS archive which includes sources,
  4683. both contain an \I{rdoff subdirectory}\c{rdoff} subdirectory holding
  4684. a set of RDOFF utilities: an RDF linker, an \c{RDF} static-library
  4685. manager, an RDF file dump utility, and a program which will load and
  4686. execute an RDF executable under Linux.
  4687. \c{rdf} supports only the \i{standard section names} \i\c{.text},
  4688. \i\c{.data} and \i\c{.bss}.
  4689. \S{rdflib} Requiring a Library: The \i\c{LIBRARY} Directive
\c{RDOFF} contains a mechanism for an object file to demand that a
given library be linked to the module, either at load time or run time.
This is done by the \c{LIBRARY} directive, which takes one argument,
the name of the library:
  4694. \c library mylib.rdl
  4695. \S{rdfmod} Specifying a Module Name: The \i\c{MODULE} Directive
A special \c{RDOFF} header record is used to store the name of the
module. It can be used, for example, by a run-time loader to perform
dynamic linking. The \c{MODULE} directive takes one argument, which is
the name of the current module:
  4700. \c module mymodname
Note that when you statically link modules and tell the linker to strip
the symbols from the output file, all module names will be stripped too.
To avoid this, start module names with \I{$, prefix}\c{$}, like:
  4704. \c module $kernel.core
  4705. \S{rdfglob} \c{rdf} Extensions to the \c{GLOBAL} Directive\I{GLOBAL,
  4706. rdf extensions to}
\c{RDOFF} global symbols can contain additional information needed by
the static linker. You can mark a global symbol as exported, thus
telling the linker not to strip it from the target executable or library
file. As in \c{ELF}, you can also specify whether an exported symbol
is a procedure (function) or data object.

To make a symbol exported, suffix its name with a colon and the word
\i\c{export}:
  4714. \c global sys_open:export
To specify that an exported symbol is a procedure (function), add the
word \i\c{proc} or \i\c{function} after \c{export}:
  4717. \c global sys_open:export proc
Similarly, to specify an exported data object, add the word \i\c{data}
or \i\c{object} to the directive:
  4720. \c global kernel_ticks:export data
  4721. \S{rdfimpt} \c{rdf} Extensions to the \c{EXTERN} Directive\I{EXTERN,
  4722. rdf extensions to}
  4723. By default the \c{EXTERN} directive in \c{RDOFF} declares a "pure external"
  4724. symbol (i.e. the static linker will complain if such a symbol is not resolved).
  4725. To declare an "imported" symbol, which must be resolved later during a dynamic
  4726. linking phase, \c{RDOFF} offers an additional \c{import} modifier. As in
  4727. \c{GLOBAL}, you can also specify whether an imported symbol is a procedure
  4728. (function) or data object. For example:
  4729. \c library $libc
  4730. \c extern _open:import
  4731. \c extern _printf:import proc
  4732. \c extern _errno:import data
  4733. Here the directive \c{LIBRARY} is also included, which gives the dynamic linker
  4734. a hint as to where to find requested symbols.
  4735. \H{dbgfmt} \i\c{dbg}: Debugging Format
  4736. The \c{dbg} format does not output an object file as such; instead,
  4737. it outputs a text file which contains a complete list of all the
  4738. transactions between the main body of NASM and the output-format
  4739. back end module. It is primarily intended to aid people who want to
  4740. write their own output drivers, so that they can get a clearer idea
  4741. of the various requests the main program makes of the output driver,
  4742. and in what order they happen.
  4743. For simple files, one can easily use the \c{dbg} format like this:
  4744. \c nasm -f dbg filename.asm
  4745. which will generate a diagnostic file called \c{filename.dbg}.
  4746. However, this will not work well on files which were designed for a
  4747. different object format, because each object format defines its own
  4748. macros (usually user-level forms of directives), and those macros
  4749. will not be defined in the \c{dbg} format. Therefore it can be
  4750. useful to run NASM twice, in order to do the preprocessing with the
  4751. native object format selected:
  4752. \c nasm -e -f rdf -o rdfprog.i rdfprog.asm
  4753. \c nasm -a -f dbg rdfprog.i
  4754. This preprocesses \c{rdfprog.asm} into \c{rdfprog.i}, keeping the
  4755. \c{rdf} object format selected in order to make sure RDF special
  4756. directives are converted into primitive form correctly. Then the
  4757. preprocessed source is fed through the \c{dbg} format to generate
  4758. the final diagnostic output.
  4759. This workaround will still typically not work for programs intended
  4760. for \c{obj} format, because the \c{obj} \c{SEGMENT} and \c{GROUP}
  4761. directives have side effects of defining the segment and group names
  4762. as symbols; \c{dbg} will not do this, so the program will not
  4763. assemble. You will have to work around that by defining the symbols
  4764. yourself (using \c{EXTERN}, for example) if you really need to get a
  4765. \c{dbg} trace of an \c{obj}-specific source file.
  4766. \c{dbg} accepts any section name and any directives at all, and logs
  4767. them all to its output file.
  4768. \c{dbg} accepts and logs any \c{%pragma}, but the specific
  4769. \c{%pragma}:
  4770. \c %pragma dbg maxdump <size>
  4771. where \c{<size>} is either a number or \c{unlimited}, can be used to
  4772. control the maximum size for dumping the full contents of a
  4773. \c{rawdata} output object.
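
For instance, either of the following directives could be placed near
the top of a source file. The value \c{64} is an arbitrary illustration,
not a recommended setting.

\c ; limit how much of each rawdata object is dumped
\c %pragma dbg maxdump 64
\c
\c ; or: always dump rawdata objects in full
\c %pragma dbg maxdump unlimited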
  4774. \C{16bit} Writing 16-bit Code (DOS, Windows 3/3.1)
  4775. This chapter attempts to cover some of the common issues encountered
  4776. when writing 16-bit code to run under \c{MS-DOS} or \c{Windows 3.x}. It
  4777. covers how to link programs to produce \c{.EXE} or \c{.COM} files,
  4778. how to write \c{.SYS} device drivers, and how to interface assembly
  4779. language code with 16-bit C compilers and with Borland Pascal.
  4780. \H{exefiles} Producing \i\c{.EXE} Files
  4781. Any large program written under DOS needs to be built as a \c{.EXE}
  4782. file: only \c{.EXE} files have the necessary internal structure
  4783. required to span more than one 64K segment. \i{Windows} programs,
  4784. also, have to be built as \c{.EXE} files, since Windows does not
  4785. support the \c{.COM} format.
  4786. In general, you generate \c{.EXE} files by using the \c{obj} output
  4787. format to produce one or more \i\c{.OBJ} files, and then linking
  4788. them together using a linker. However, NASM also supports the direct
  4789. generation of simple DOS \c{.EXE} files using the \c{bin} output
  4790. format (by using \c{DB} and \c{DW} to construct the \c{.EXE} file
  4791. header), and a macro package is supplied to do this. Thanks to
  4792. Yann Guidon for contributing the code for this.
  4793. NASM may also support \c{.EXE} natively as another output format in
  4794. future releases.
  4795. \S{objexe} Using the \c{obj} Format To Generate \c{.EXE} Files
  4796. This section describes the usual method of generating \c{.EXE} files
  4797. by linking \c{.OBJ} files together.
  4798. Most 16-bit programming language packages come with a suitable
  4799. linker; if you have none of these, there is a free linker called
  4800. \i{VAL}\I{linker, free}, available in \c{LZH} archive format from
  4801. \W{ftp://x2ftp.oulu.fi/pub/msdos/programming/lang/}\i\c{x2ftp.oulu.fi}.
  4802. An LZH archiver can be found at
  4803. \W{ftp://ftp.simtel.net/pub/simtelnet/msdos/arcers}\i\c{ftp.simtel.net}.
  4804. There is another `free' linker (though this one doesn't come with
  4805. sources) called \i{FREELINK}, available from
  4806. \W{http://www.pcorner.com/tpc/old/3-101.html}\i\c{www.pcorner.com}.
  4807. A third, \i\c{djlink}, written by DJ Delorie, is available at
  4808. \W{http://www.delorie.com/djgpp/16bit/djlink/}\i\c{www.delorie.com}.
  4809. A fourth linker, \i\c{ALINK}, written by Anthony A.J. Williams, is
  4810. available at \W{http://alink.sourceforge.net}\i\c{alink.sourceforge.net}.
  4811. When linking several \c{.OBJ} files into a \c{.EXE} file, you should
  4812. ensure that exactly one of them has a start point defined (using the
  4813. \I{program entry point}\i\c{..start} special symbol defined by the
  4814. \c{obj} format: see \k{dotdotstart}). If no module defines a start
  4815. point, the linker will not know what value to give the entry-point
  4816. field in the output file header; if more than one defines a start
  4817. point, the linker will not know \e{which} value to use.
  4818. An example of a NASM source file which can be assembled to a
  4819. \c{.OBJ} file and linked on its own to a \c{.EXE} is given here. It
  4820. demonstrates the basic principles of defining a stack, initialising
  4821. the segment registers, and declaring a start point. This file is
  4822. also provided in the \I{test subdirectory}\c{test} subdirectory of
  4823. the NASM archives, under the name \c{objexe.asm}.
\c segment code
\c
\c ..start:
\c         mov     ax,data
\c         mov     ds,ax
\c         mov     ax,stack
\c         mov     ss,ax
\c         mov     sp,stacktop
  4832. This initial piece of code sets up \c{DS} to point to the data
  4833. segment, and initializes \c{SS} and \c{SP} to point to the top of
  4834. the provided stack. Notice that interrupts are implicitly disabled
  4835. for one instruction after a move into \c{SS}, precisely for this
  4836. situation, so that there's no chance of an interrupt occurring
  4837. between the loads of \c{SS} and \c{SP} and not having a stack to
  4838. execute on.
  4839. Note also that the special symbol \c{..start} is defined at the
  4840. beginning of this code, which means that will be the entry point
  4841. into the resulting executable file.
\c         mov     dx,hello
\c         mov     ah,9
\c         int     0x21
  4845. The above is the main program: load \c{DS:DX} with a pointer to the
  4846. greeting message (\c{hello} is implicitly relative to the segment
  4847. \c{data}, which was loaded into \c{DS} in the setup code, so the
  4848. full pointer is valid), and call the DOS print-string function.
\c         mov     ax,0x4c00
\c         int     0x21
  4851. This terminates the program using another DOS system call.
  4852. \c segment data
  4853. \c
  4854. \c hello: db 'hello, world', 13, 10, '$'
  4855. The data segment contains the string we want to display.
\c segment stack stack
\c         resb 64
\c stacktop:
  4859. The above code declares a stack segment containing 64 bytes of
  4860. uninitialized stack space, and points \c{stacktop} at the top of it.
  4861. The directive \c{segment stack stack} defines a segment \e{called}
  4862. \c{stack}, and also of \e{type} \c{STACK}. The latter is not
  4863. necessary to the correct running of the program, but linkers are
  4864. likely to issue warnings or errors if your program has no segment of
  4865. type \c{STACK}.
  4866. The above file, when assembled into a \c{.OBJ} file, will link on
  4867. its own to a valid \c{.EXE} file, which when run will print `hello,
  4868. world' and then exit.
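
To build it, assemble with the \c{obj} format and pass the result to
your linker. The command below is a sketch that assumes the source has
been saved as \c{objexe.asm}; the linker invocation depends on which
linker you use and is not shown.

\c nasm -f obj objexe.asm

This should produce \c{objexe.obj}, since the \c{obj} format supplies a
default output extension of \c{.obj}.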
  4869. \S{binexe} Using the \c{bin} Format To Generate \c{.EXE} Files
  4870. The \c{.EXE} file format is simple enough that it's possible to
  4871. build a \c{.EXE} file by writing a pure-binary program and sticking
  4872. a 32-byte header on the front. This header is simple enough that it
  4873. can be generated using \c{DB} and \c{DW} commands by NASM itself, so
  4874. that you can use the \c{bin} output format to directly generate
  4875. \c{.EXE} files.
  4876. Included in the NASM archives, in the \I{misc subdirectory}\c{misc}
  4877. subdirectory, is a file \i\c{exebin.mac} of macros. It defines three
  4878. macros: \i\c{EXE_begin}, \i\c{EXE_stack} and \i\c{EXE_end}.
  4879. To produce a \c{.EXE} file using this method, you should start by
  4880. using \c{%include} to load the \c{exebin.mac} macro package into
  4881. your source file. You should then issue the \c{EXE_begin} macro call
  4882. (which takes no arguments) to generate the file header data. Then
  4883. write code as normal for the \c{bin} format - you can use all three
  4884. standard sections \c{.text}, \c{.data} and \c{.bss}. At the end of
  4885. the file you should call the \c{EXE_end} macro (again, no arguments),
  4886. which defines some symbols to mark section sizes, and these symbols
  4887. are referred to in the header code generated by \c{EXE_begin}.
  4888. In this model, the code you end up writing starts at \c{0x100}, just
  4889. like a \c{.COM} file - in fact, if you strip off the 32-byte header
  4890. from the resulting \c{.EXE} file, you will have a valid \c{.COM}
  4891. program. All the segment bases are the same, so you are limited to a
  4892. 64K program, again just like a \c{.COM} file. Note that an \c{ORG}
  4893. directive is issued by the \c{EXE_begin} macro, so you should not
  4894. explicitly issue one of your own.
  4895. You can't directly refer to your segment base value, unfortunately,
  4896. since this would require a relocation in the header, and things
  4897. would get a lot more complicated. So you should get your segment
  4898. base by copying it out of \c{CS} instead.
On entry to your \c{.EXE} file, \c{SS:SP} are already set up to
point to the top of a 2KB stack. You can adjust the default stack
size of 2KB by calling the \c{EXE_stack} macro. For example, to
change the stack size of your program to 64 bytes, you would call
\c{EXE_stack 64}.
  4904. A sample program which generates a \c{.EXE} file in this way is
  4905. given in the \c{test} subdirectory of the NASM archive, as
  4906. \c{binexe.asm}.
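
A minimal sketch following the steps described above is shown here. It
is written from that description and is not claimed to be identical to
the shipped sample; the label \c{hello} is illustrative.

\c %include "exebin.mac"
\c
\c         EXE_begin               ; emits the header and issues ORG
\c         EXE_stack 64            ; override the default stack size
\c
\c section .text
\c
\c         mov     ax,cs           ; take the segment base from CS
\c         mov     ds,ax
\c
\c         mov     dx,hello
\c         mov     ah,9            ; DOS print-string function
\c         int     0x21
\c
\c         mov     ax,0x4c00       ; DOS terminate
\c         int     0x21
\c
\c section .data
\c
\c hello:  db 'hello, world', 13, 10, '$'
\c
\c         EXE_end                 ; defines the size symbols used by the header

It can be assembled with a command such as the following, adding a
\c{-i} option if \c{exebin.mac} is not in the current directory:

\c nasm -f bin -o binexe.exe binexe.asm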
  4907. \H{comfiles} Producing \i\c{.COM} Files
  4908. While large DOS programs must be written as \c{.EXE} files, small
  4909. ones are often better written as \c{.COM} files. \c{.COM} files are
  4910. pure binary, and therefore most easily produced using the \c{bin}
  4911. output format.
  4912. \S{combinfmt} Using the \c{bin} Format To Generate \c{.COM} Files
  4913. \c{.COM} files expect to be loaded at offset \c{100h} into their
  4914. segment (though the segment may change). Execution then begins at
  4915. \I\c{ORG}\c{100h}, i.e. right at the start of the program. So to
  4916. write a \c{.COM} program, you would create a source file looking
  4917. like
\c         org 100h
\c
\c section .text
\c
\c start:
\c         ; put your code here
\c
\c section .data
\c
\c         ; put data items here
\c
\c section .bss
\c
\c         ; put uninitialized data here
  4932. The \c{bin} format puts the \c{.text} section first in the file, so
  4933. you can declare data or BSS items before beginning to write code if
  4934. you want to and the code will still end up at the front of the file
  4935. where it belongs.
  4936. The BSS (uninitialized data) section does not take up space in the
  4937. \c{.COM} file itself: instead, addresses of BSS items are resolved
  4938. to point at space beyond the end of the file, on the grounds that
  4939. this will be free memory when the program is run. Therefore you
  4940. should not rely on your BSS being initialized to all zeros when you
  4941. run.
  4942. To assemble the above program, you should use a command line like
  4943. \c nasm myprog.asm -fbin -o myprog.com
  4944. The \c{bin} format would produce a file called \c{myprog} if no
  4945. explicit output file name were specified, so you have to override it
  4946. and give the desired file name.
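
Filling in the skeleton above gives a complete, if trivial, \c{.COM}
program. This sketch reuses the two DOS calls from the \c{.EXE} example
in \k{objexe}; the label \c{hello} is illustrative.

\c         org 100h
\c
\c section .text
\c
\c start:
\c         mov     dx,hello        ; DS:DX -> message (DS = CS in a .COM file)
\c         mov     ah,9            ; DOS print-string function
\c         int     0x21
\c         mov     ax,0x4c00       ; DOS terminate, exit code 0
\c         int     0x21
\c
\c section .data
\c
\c hello:  db 'hello, world', 13, 10, '$'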
  4947. \S{comobjfmt} Using the \c{obj} Format To Generate \c{.COM} Files
  4948. If you are writing a \c{.COM} program as more than one module, you
  4949. may wish to assemble several \c{.OBJ} files and link them together
  4950. into a \c{.COM} program. You can do this, provided you have a linker
  4951. capable of outputting \c{.COM} files directly (\i{TLINK} does this),
  4952. or alternatively a converter program such as \i\c{EXE2BIN} to
  4953. transform the \c{.EXE} file output from the linker into a \c{.COM}
  4954. file.
  4955. If you do this, you need to take care of several things:
  4956. \b The first object file containing code should start its code
  4957. segment with a line like \c{RESB 100h}. This is to ensure that the
  4958. code begins at offset \c{100h} relative to the beginning of the code
  4959. segment, so that the linker or converter program does not have to
  4960. adjust address references within the file when generating the
  4961. \c{.COM} file. Other assemblers use an \i\c{ORG} directive for this
  4962. purpose, but \c{ORG} in NASM is a format-specific directive to the
  4963. \c{bin} output format, and does not mean the same thing as it does
  4964. in MASM-compatible assemblers.
  4965. \b You don't need to define a stack segment.
  4966. \b All your segments should be in the same group, so that every time
  4967. your code or data references a symbol offset, all offsets are
  4968. relative to the same segment base. This is because, when a \c{.COM}
  4969. file is loaded, all the segment registers contain the same value.
  4970. \H{sysfiles} Producing \i\c{.SYS} Files
  4971. \i{MS-DOS device drivers} - \c{.SYS} files - are pure binary files,
  4972. similar to \c{.COM} files, except that they start at origin zero
  4973. rather than \c{100h}. Therefore, if you are writing a device driver
  4974. using the \c{bin} format, you do not need the \c{ORG} directive,
  4975. since the default origin for \c{bin} is zero. Similarly, if you are
  4976. using \c{obj}, you do not need the \c{RESB 100h} at the start of
  4977. your code segment.
  4978. \c{.SYS} files start with a header structure, containing pointers to
  4979. the various routines inside the driver which do the work. This
  4980. structure should be defined at the start of the code segment, even
  4981. though it is not actually code.
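
As a rough sketch of what this looks like with the \c{bin} format, the
fragment below lays out a header for a character device followed by
stub routines. The field layout shown is the conventional MS-DOS one
and is an assumption here rather than something specified by this
manual; the names \c{header}, \c{strat}, \c{intr}, \c{reqptr} and the
device name \c{MYDEV} are hypothetical. A real driver must also
dispatch on the command code in the request header, which is omitted.

\c section .text                   ; bin format: origin defaults to zero
\c
\c header:
\c         dd      -1              ; link to next driver (filled in by DOS)
\c         dw      0x8000          ; attribute word: bit 15 = character device
\c         dw      strat           ; offset of the strategy routine
\c         dw      intr            ; offset of the interrupt routine
\c         db      'MYDEV   '      ; device name, padded to 8 characters
\c
\c reqptr: dd      0               ; saved far pointer to the request header
\c
\c strat:                          ; DOS passes the request header in ES:BX
\c         mov     [cs:reqptr],bx
\c         mov     [cs:reqptr+2],es
\c         retf
\c
\c intr:
\c         ; examine the request saved at reqptr, do the work,
\c         ; and store a status word back into the request header
\c         retf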
  4982. For more information on the format of \c{.SYS} files, and the data
  4983. which has to go in the header structure, a list of books is given in
  4984. the Frequently Asked Questions list for the newsgroup
  4985. \W{news:comp.os.msdos.programmer}\i\c{comp.os.msdos.programmer}.
  4986. \H{16c} Interfacing to 16-bit C Programs
  4987. This section covers the basics of writing assembly routines that
  4988. call, or are called from, C programs. To do this, you would
  4989. typically write an assembly module as a \c{.OBJ} file, and link it
  4990. with your C modules to produce a \i{mixed-language program}.
  4991. \S{16cunder} External Symbol Names
  4992. \I{C symbol names}\I{underscore, in C symbols}C compilers have the
  4993. convention that the names of all global symbols (functions or data)
  4994. they define are formed by prefixing an underscore to the name as it
  4995. appears in the C program. So, for example, the function a C
  4996. programmer thinks of as \c{printf} appears to an assembly language
  4997. programmer as \c{_printf}. This means that in your assembly
  4998. programs, you can define symbols without a leading underscore, and
  4999. not have to worry about name clashes with C symbols.
  5000. If you find the underscores inconvenient, you can define macros to
  5001. replace the \c{GLOBAL} and \c{EXTERN} directives as follows:
\c %macro  cglobal 1
\c
\c   global  _%1
\c   %define %1 _%1
\c
\c %endmacro
\c
\c %macro  cextern 1
\c
\c   extern  _%1
\c   %define %1 _%1
\c
\c %endmacro
  5015. (These forms of the macros only take one argument at a time; a
  5016. \c{%rep} construct could solve this.)
  5017. If you then declare an external like this:
  5018. \c cextern printf
  5019. then the macro will expand it as
  5020. \c extern _printf
  5021. \c %define printf _printf
Thereafter, you can reference \c{printf} as if it were a symbol, and
the preprocessor will put the leading underscore on where necessary.
  5024. The \c{cglobal} macro works similarly. You must use \c{cglobal}
  5025. before defining the symbol in question, but you would have had to do
  5026. that anyway if you used \c{GLOBAL}.
  5027. Also see \k{opt-pfix}.
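
The one-argument limitation mentioned above can be lifted with
\c{%rep} and \c{%rotate}. The following alternative definition of
\c{cextern} is a sketch of that idea; \c{cglobal} could be rewritten
in the same way.

\c %macro  cextern 1-*
\c
\c   %rep  %0
\c     extern  _%1
\c     %define %1 _%1
\c     %rotate 1
\c   %endrep
\c
\c %endmacro

With this definition, several symbols can be declared in one line:

\c cextern printf, puts, exit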
  5028. \S{16cmodels} \i{Memory Models}
  5029. NASM contains no mechanism to support the various C memory models
  5030. directly; you have to keep track yourself of which one you are
  5031. writing for. This means you have to keep track of the following
  5032. things:
  5033. \b In models using a single code segment (tiny, small and compact),
  5034. functions are near. This means that function pointers, when stored
  5035. in data segments or pushed on the stack as function arguments, are
  5036. 16 bits long and contain only an offset field (the \c{CS} register
  5037. never changes its value, and always gives the segment part of the
  5038. full function address), and that functions are called using ordinary
  5039. near \c{CALL} instructions and return using \c{RETN} (which, in
  5040. NASM, is synonymous with \c{RET} anyway). This means both that you
  5041. should write your own routines to return with \c{RETN}, and that you
  5042. should call external C routines with near \c{CALL} instructions.
  5043. \b In models using more than one code segment (medium, large and
  5044. huge), functions are far. This means that function pointers are 32
  5045. bits long (consisting of a 16-bit offset followed by a 16-bit
  5046. segment), and that functions are called using \c{CALL FAR} (or
  5047. \c{CALL seg:offset}) and return using \c{RETF}. Again, you should
  5048. therefore write your own routines to return with \c{RETF} and use
  5049. \c{CALL FAR} to call external routines.
  5050. \b In models using a single data segment (tiny, small and medium),
  5051. data pointers are 16 bits long, containing only an offset field (the
  5052. \c{DS} register doesn't change its value, and always gives the
  5053. segment part of the full data item address).
  5054. \b In models using more than one data segment (compact, large and
  5055. huge), data pointers are 32 bits long, consisting of a 16-bit offset
  5056. followed by a 16-bit segment. You should still be careful not to
  5057. modify \c{DS} in your routines without restoring it afterwards, but
  5058. \c{ES} is free for you to use to access the contents of 32-bit data
  5059. pointers you are passed.
  5060. \b The huge memory model allows single data items to exceed 64K in
  5061. size. In all other memory models, you can access the whole of a data
  5062. item just by doing arithmetic on the offset field of the pointer you
  5063. are given, whether a segment field is present or not; in huge model,
  5064. you have to be more careful of your pointer arithmetic.
\b In most memory models, there is a \e{default} data segment, whose
segment address is kept in \c{DS} throughout the program. This data
segment is typically the same segment as the stack, kept in \c{SS},
so that functions' local variables (which are stored on the stack)
and global data items can both be accessed easily without changing
\c{DS}. Particularly large data items are typically stored in other
segments. However, some memory models (though usually not the
standard ones) drop the assumption that \c{SS} and \c{DS} hold the
same value. In that case, take care when accessing functions' local
variables, since they can no longer be reached through \c{DS}.
  5075. In models with a single code segment, the segment is called
  5076. \i\c{_TEXT}, so your code segment must also go by this name in order
  5077. to be linked into the same place as the main code segment. In models
  5078. with a single data segment, or with a default data segment, it is
  5079. called \i\c{_DATA}.
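As an illustration of the points above, here is a minimal sketch of a compact-model routine (near code, far data). It assumes a hypothetical C prototype \c{int peekw(int far *p)}; the function name is invented for this example. The code goes in \c{_TEXT}, returns with a near \c{RETN}, and uses \c{ES} to follow the far pointer so that \c{DS} is left alone.

```nasm
        bits 16

segment _TEXT                   ; same name as the C compiler's code segment

        global  _peekw          ; hypothetical: int peekw(int far *p)

_peekw:
        push    bp
        mov     bp,sp
        les     bx,[bp+4]       ; BX = offset at [bp+4], ES = segment at [bp+6]
        mov     ax,[es:bx]      ; read the int through the far pointer
        pop     bp
        retn                    ; near return: only one code segment
```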
  5080. \S{16cfunc} Function Definitions and Function Calls
  5081. \I{functions, C calling convention}The \i{C calling convention} in
  5082. 16-bit programs is as follows. In the following description, the
  5083. words \e{caller} and \e{callee} are used to denote the function
  5084. doing the calling and the function which gets called.
  5085. \b The caller pushes the function's parameters on the stack, one
  5086. after another, in reverse order (right to left, so that the first
  5087. argument specified to the function is pushed last).
  5088. \b The caller then executes a \c{CALL} instruction to pass control
  5089. to the callee. This \c{CALL} is either near or far depending on the
  5090. memory model.
  5091. \b The callee receives control, and typically (although this is not
  5092. actually necessary, in functions which do not need to access their
  5093. parameters) starts by saving the value of \c{SP} in \c{BP} so as to
  5094. be able to use \c{BP} as a base pointer to find its parameters on
  5095. the stack. However, the caller was probably doing this too, so part
  5096. of the calling convention states that \c{BP} must be preserved by
  5097. any C function. Hence the callee, if it is going to set up \c{BP} as
  5098. a \i\e{frame pointer}, must push the previous value first.
  5099. \b The callee may then access its parameters relative to \c{BP}.
  5100. The word at \c{[BP]} holds the previous value of \c{BP} as it was
  5101. pushed; the next word, at \c{[BP+2]}, holds the offset part of the
  5102. return address, pushed implicitly by \c{CALL}. In a small-model
  5103. (near) function, the parameters start after that, at \c{[BP+4]}; in
  5104. a large-model (far) function, the segment part of the return address
  5105. lives at \c{[BP+4]}, and the parameters begin at \c{[BP+6]}. The
  5106. leftmost parameter of the function, since it was pushed last, is
  5107. accessible at this offset from \c{BP}; the others follow, at
  5108. successively greater offsets. Thus, in a function such as \c{printf}
  5109. which takes a variable number of parameters, the pushing of the
  5110. parameters in reverse order means that the function knows where to
  5111. find its first parameter, which tells it the number and type of the
  5112. remaining ones.
  5113. \b The callee may also wish to decrease \c{SP} further, so as to
  5114. allocate space on the stack for local variables, which will then be
  5115. accessible at negative offsets from \c{BP}.
  5116. \b The callee, if it wishes to return a value to the caller, should
  5117. leave the value in \c{AL}, \c{AX} or \c{DX:AX} depending on the size
  5118. of the value. Floating-point results are sometimes (depending on the
  5119. compiler) returned in \c{ST0}.
  5120. \b Once the callee has finished processing, it restores \c{SP} from
  5121. \c{BP} if it had allocated local stack space, then pops the previous
  5122. value of \c{BP}, and returns via \c{RETN} or \c{RETF} depending on
  5123. memory model.
  5124. \b When the caller regains control from the callee, the function
  5125. parameters are still on the stack, so it typically adds an immediate
  5126. constant to \c{SP} to remove them (instead of executing a number of
  5127. slow \c{POP} instructions). Thus, if a function is accidentally
  5128. called with the wrong number of parameters due to a prototype
  5129. mismatch, the stack will still be returned to a sensible state since
  5130. the caller, which \e{knows} how many parameters it pushed, does the
  5131. removing.
  5132. It is instructive to compare this calling convention with that for
  5133. Pascal programs (described in \k{16bpfunc}). Pascal has a simpler
  5134. convention, since no functions have variable numbers of parameters.
  5135. Therefore the callee knows how many parameters it should have been
  5136. passed, and is able to deallocate them from the stack itself by
  5137. passing an immediate argument to the \c{RET} or \c{RETF}
  5138. instruction, so the caller does not have to do it. Also, the
  5139. parameters are pushed in left-to-right order, not right-to-left,
  5140. which means that a compiler can give better guarantees about
  5141. sequence points without performance suffering.
  5142. Thus, you would define a function in C style in the following way.
  5143. The following example is for small model:
\c global  _myfunc
\c
\c _myfunc:
\c         push    bp
\c         mov     bp,sp
\c         sub     sp,0x40         ; 64 bytes of local stack space
\c         mov     bx,[bp+4]       ; first parameter to function
\c
\c         ; some more code
\c
\c         mov     sp,bp           ; undo "sub sp,0x40" above
\c         pop     bp
\c         ret
  5157. For a large-model function, you would replace \c{RET} by \c{RETF},
  5158. and look for the first parameter at \c{[BP+6]} instead of
  5159. \c{[BP+4]}. Of course, if one of the parameters is a pointer, then
  5160. the offsets of \e{subsequent} parameters will change depending on
  5161. the memory model as well: far pointers take up four bytes on the
  5162. stack when passed as a parameter, whereas near pointers take up two.
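A large-model version of the same skeleton might look like the sketch below. It assumes a hypothetical prototype \c{int myfunc(char *s, int n)}, chosen only to show how a four-byte far pointer shifts the offset of the parameter that follows it.

```nasm
        global  _myfunc         ; hypothetical: int myfunc(char *s, int n)

_myfunc:
        push    bp
        mov     bp,sp
        sub     sp,0x40         ; 64 bytes of local stack space
        les     bx,[bp+6]       ; first parameter: far pointer, 4 bytes at [bp+6]
        mov     cx,[bp+10]      ; second parameter starts after the pointer

        ; some more code

        mov     sp,bp           ; undo "sub sp,0x40" above
        pop     bp
        retf                    ; far return in large model
```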
  5163. At the other end of the process, to call a C function from your
  5164. assembly code, you would do something like this:
  5165. \c extern _printf
  5166. \c
  5167. \c ; and then, further down...
  5168. \c
  5169. \c push word [myint] ; one of my integer variables
  5170. \c push word mystring ; pointer into my data segment
  5171. \c call _printf
  5172. \c add sp,byte 4 ; `byte' saves space
  5173. \c
  5174. \c ; then those data items...
  5175. \c
  5176. \c segment _DATA
  5177. \c
  5178. \c myint dw 1234
  5179. \c mystring db 'This number -> %d <- should be 1234',10,0
  5180. This piece of code is the small-model assembly equivalent of the C
  5181. code
  5182. \c int myint = 1234;
  5183. \c printf("This number -> %d <- should be 1234\n", myint);
  5184. In large model, the function-call code might look more like this. In
  5185. this example, it is assumed that \c{DS} already holds the segment
  5186. base of the segment \c{_DATA}. If not, you would have to initialize
  5187. it first.
  5188. \c push word [myint]
  5189. \c push word seg mystring ; Now push the segment, and...
  5190. \c push word mystring ; ... offset of "mystring"
  5191. \c call far _printf
  5192. \c add sp,byte 6
  5193. The integer value still takes up one word on the stack, since large
  5194. model does not affect the size of the \c{int} data type. The first
  5195. argument (pushed last) to \c{printf}, however, is a data pointer,
  5196. and therefore has to contain a segment and offset part. The segment
  5197. should be stored second in memory, and therefore must be pushed
  5198. first. (Of course, \c{PUSH DS} would have been a shorter instruction
  5199. than \c{PUSH WORD SEG mystring}, if \c{DS} was set up as the above
  5200. example assumed.) Then the actual call becomes a far call, since
  5201. functions expect far calls in large model; and \c{SP} has to be
  5202. increased by 6 rather than 4 afterwards to make up for the extra
  5203. word of parameters.
  5204. \S{16cdata} Accessing Data Items
  5205. To get at the contents of C variables, or to declare variables which
  5206. C can access, you need only declare the names as \c{GLOBAL} or
  5207. \c{EXTERN}. (Again, the names require leading underscores, as stated
  5208. in \k{16cunder}.) Thus, a C variable declared as \c{int i} can be
  5209. accessed from assembler as
  5210. \c extern _i
  5211. \c
  5212. \c mov ax,[_i]
  5213. And to declare your own integer variable which C programs can access
  5214. as \c{extern int j}, you do this (making sure you are assembling in
  5215. the \c{_DATA} segment, if necessary):
  5216. \c global _j
  5217. \c
  5218. \c _j dw 0
  5219. To access a C array, you need to know the size of the components of
  5220. the array. For example, \c{int} variables are two bytes long, so if
  5221. a C program declares an array as \c{int a[10]}, you can access
  5222. \c{a[3]} by coding \c{mov ax,[_a+6]}. (The byte offset 6 is obtained
  5223. by multiplying the desired array index, 3, by the size of the array
  5224. element, 2.) The sizes of the C base types in 16-bit compilers are:
  5225. 1 for \c{char}, 2 for \c{short} and \c{int}, 4 for \c{long} and
  5226. \c{float}, and 8 for \c{double}.
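When the index is only known at run time, the same multiplication has to be done in code. The fragment below is a sketch that assumes a second C variable, \c{int n}, holding the index (the name is hypothetical), and that both variables live in the default data segment addressed by \c{DS}.

```nasm
        extern  _a              ; int a[10]; in the C program
        extern  _n              ; hypothetical: int n; the index

        mov     bx,[_n]         ; BX = index
        shl     bx,1            ; multiply by 2, the size of an int
        mov     ax,[_a+bx]      ; AX = a[n]
```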
  5227. To access a C \i{data structure}, you need to know the offset from
  5228. the base of the structure to the field you are interested in. You
  5229. can either do this by converting the C structure definition into a
  5230. NASM structure definition (using \i\c{STRUC}), or by calculating the
  5231. one offset and using just that.
  5232. To do either of these, you should read your C compiler's manual to
  5233. find out how it organizes data structures. NASM gives no special
  5234. alignment to structure members in its own \c{STRUC} macro, so you
  5235. have to specify alignment yourself if the C compiler generates it.
  5236. Typically, you might find that a structure like
  5237. \c struct {
  5238. \c char c;
  5239. \c int i;
  5240. \c } foo;
  5241. might be four bytes long rather than three, since the \c{int} field
  5242. would be aligned to a two-byte boundary. However, this sort of
  5243. feature tends to be a configurable option in the C compiler, either
  5244. using command-line options or \c{#pragma} lines, so you have to find
  5245. out how your own compiler does it.
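If the compiler does pad the structure as described, a matching NASM definition could be written like this. The structure name \c{foo_t} is invented for the example, and the \c{alignb 2} line should be dropped if your compiler packs the fields instead.

```nasm
struc foo_t                     ; hypothetical name for the C structure type
        .c:     resb 1          ; char c;  at offset 0
                alignb 2        ; padding so the next field is word-aligned
        .i:     resw 1          ; int i;   at offset 2
endstruc                        ; foo_t_size is now 4

        extern  _foo            ; the C variable foo

        mov     ax,[_foo + foo_t.i]     ; AX = foo.i
        mov     bl,[_foo + foo_t.c]     ; BL = foo.c
```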
  5246. \S{16cmacro} \i\c{c16.mac}: Helper Macros for the 16-bit C Interface
  5247. Included in the NASM archives, in the \I{misc subdirectory}\c{misc}
  5248. directory, is a file \c{c16.mac} of macros. It defines three macros:
  5249. \i\c{proc}, \i\c{arg} and \i\c{endproc}. These are intended to be
  5250. used for C-style procedure definitions, and they automate a lot of
  5251. the work involved in keeping track of the calling convention.
(An alternative, TASM-compatible form of \c{arg} is also now built
into NASM's preprocessor. See \k{stackrel} for details.)
  5254. An example of an assembly function using the macro set is given
  5255. here:
  5256. \c proc _nearproc
  5257. \c
  5258. \c %$i arg
  5259. \c %$j arg
  5260. \c mov ax,[bp + %$i]
  5261. \c mov bx,[bp + %$j]
  5262. \c add ax,[bx]
  5263. \c
  5264. \c endproc
  5265. This defines \c{_nearproc} to be a procedure taking two arguments,
  5266. the first (\c{i}) an integer and the second (\c{j}) a pointer to an
  5267. integer. It returns \c{i + *j}.
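For comparison, a hand-written routine with the same behaviour is sketched below, following the small-model convention from \k{16cfunc}. It is not the literal expansion of the macros, only what they save you from typing; the \c{global} line is included here on the assumption that C needs to call the routine.

```nasm
        global  _nearproc

_nearproc:
        push    bp
        mov     bp,sp
        mov     ax,[bp+4]       ; i: first argument
        mov     bx,[bp+6]       ; j: second argument, a near pointer
        add     ax,[bx]         ; AX = i + *j
        pop     bp
        ret
```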
  5268. Note that the \c{arg} macro has an \c{EQU} as the first line of its
  5269. expansion, and since the label before the macro call gets prepended
  5270. to the first line of the expanded macro, the \c{EQU} works, defining
  5271. \c{%$i} to be an offset from \c{BP}. A context-local variable is
  5272. used, local to the context pushed by the \c{proc} macro and popped
  5273. by the \c{endproc} macro, so that the same argument name can be used
  5274. in later procedures. Of course, you don't \e{have} to do that.
  5275. The macro set produces code for near functions (tiny, small and
  5276. compact-model code) by default. You can have it generate far
  5277. functions (medium, large and huge-model code) by means of coding
  5278. \I\c{FARCODE}\c{%define FARCODE}. This changes the kind of return
  5279. instruction generated by \c{endproc}, and also changes the starting
  5280. point for the argument offsets. The macro set contains no intrinsic
  5281. dependency on whether data pointers are far or not.
  5282. \c{arg} can take an optional parameter, giving the size of the
  5283. argument. If no size is given, 2 is assumed, since it is likely that
  5284. many function parameters will be of type \c{int}.
  5285. The large-model equivalent of the above function would look like this:
\c %define FARCODE
\c
\c proc    _farproc
\c
\c %$i     arg
\c %$j     arg     4
\c         mov     ax,[bp + %$i]
\c         mov     bx,[bp + %$j]
\c         mov     es,[bp + %$j + 2]
\c         add     ax,[es:bx]      ; j is a far pointer, so read through ES
\c
\c endproc
  5298. This makes use of the argument to the \c{arg} macro to define a
  5299. parameter of size 4, because \c{j} is now a far pointer. When we
  5300. load from \c{j}, we must load a segment and an offset.
  5301. \H{16bp} Interfacing to \i{Borland Pascal} Programs
  5302. Interfacing to Borland Pascal programs is similar in concept to
  5303. interfacing to 16-bit C programs. The differences are:
  5304. \b The leading underscore required for interfacing to C programs is
  5305. not required for Pascal.
  5306. \b The memory model is always large: functions are far, data
  5307. pointers are far, and no data item can be more than 64K long.
  5308. (Actually, some functions are near, but only those functions that
  5309. are local to a Pascal unit and never called from outside it. All
  5310. assembly functions that Pascal calls, and all Pascal functions that
  5311. assembly routines are able to call, are far.) However, all static
  5312. data declared in a Pascal program goes into the default data
  5313. segment, which is the one whose segment address will be in \c{DS}
  5314. when control is passed to your assembly code. The only things that
  5315. do not live in the default data segment are local variables (they
  5316. live in the stack segment) and dynamically allocated variables. All
  5317. data \e{pointers}, however, are far.
\b The function calling convention is different, as described below.
\b Some data types, such as strings, are stored differently.
\b There are restrictions on the segment names you are allowed to
use: Borland Pascal ignores code or data declared in a segment
whose name it does not accept. The restrictions are described below.
  5323. \S{16bpfunc} The Pascal Calling Convention
  5324. \I{functions, Pascal calling convention}\I{Pascal calling
  5325. convention}The 16-bit Pascal calling convention is as follows. In
  5326. the following description, the words \e{caller} and \e{callee} are
  5327. used to denote the function doing the calling and the function which
  5328. gets called.
  5329. \b The caller pushes the function's parameters on the stack, one
  5330. after another, in normal order (left to right, so that the first
  5331. argument specified to the function is pushed first).
  5332. \b The caller then executes a far \c{CALL} instruction to pass
  5333. control to the callee.
  5334. \b The callee receives control, and typically (although this is not
  5335. actually necessary, in functions which do not need to access their
  5336. parameters) starts by saving the value of \c{SP} in \c{BP} so as to
  5337. be able to use \c{BP} as a base pointer to find its parameters on
  5338. the stack. However, the caller was probably doing this too, so part
  5339. of the calling convention states that \c{BP} must be preserved by
  5340. any function. Hence the callee, if it is going to set up \c{BP} as a
  5341. \i{frame pointer}, must push the previous value first.
  5342. \b The callee may then access its parameters relative to \c{BP}.
  5343. The word at \c{[BP]} holds the previous value of \c{BP} as it was
  5344. pushed. The next word, at \c{[BP+2]}, holds the offset part of the
  5345. return address, and the next one at \c{[BP+4]} the segment part. The
  5346. parameters begin at \c{[BP+6]}. The rightmost parameter of the
  5347. function, since it was pushed last, is accessible at this offset
  5348. from \c{BP}; the others follow, at successively greater offsets.
  5349. \b The callee may also wish to decrease \c{SP} further, so as to
  5350. allocate space on the stack for local variables, which will then be
  5351. accessible at negative offsets from \c{BP}.
  5352. \b The callee, if it wishes to return a value to the caller, should
  5353. leave the value in \c{AL}, \c{AX} or \c{DX:AX} depending on the size
  5354. of the value. Floating-point results are returned in \c{ST0}.
  5355. Results of type \c{Real} (Borland's own custom floating-point data
  5356. type, not handled directly by the FPU) are returned in \c{DX:BX:AX}.
  5357. To return a result of type \c{String}, the caller pushes a pointer
  5358. to a temporary string before pushing the parameters, and the callee
  5359. places the returned string value at that location. The pointer is
  5360. not a parameter, and should not be removed from the stack by the
  5361. \c{RETF} instruction.
  5362. \b Once the callee has finished processing, it restores \c{SP} from
  5363. \c{BP} if it had allocated local stack space, then pops the previous
  5364. value of \c{BP}, and returns via \c{RETF}. It uses the form of
  5365. \c{RETF} with an immediate parameter, giving the number of bytes
  5366. taken up by the parameters on the stack. This causes the parameters
  5367. to be removed from the stack as a side effect of the return
  5368. instruction.
  5369. \b When the caller regains control from the callee, the function
  5370. parameters have already been removed from the stack, so it needs to
  5371. do nothing further.
  5372. Thus, you would define a function in Pascal style, taking two
  5373. \c{Integer}-type parameters, in the following way:
\c global  myfunc
\c
\c myfunc: push    bp
\c         mov     bp,sp
\c         sub     sp,0x40         ; 64 bytes of local stack space
\c         mov     bx,[bp+8]       ; first parameter to function
\c         mov     cx,[bp+6]       ; second parameter to function
\c
\c         ; some more code
\c
\c         mov     sp,bp           ; undo "sub sp,0x40" above
\c         pop     bp
\c         retf    4               ; total size of params is 4
  5387. At the other end of the process, to call a Pascal function from your
  5388. assembly code, you would do something like this:
  5389. \c extern SomeFunc
  5390. \c
  5391. \c ; and then, further down...
  5392. \c
  5393. \c push word seg mystring ; Now push the segment, and...
  5394. \c push word mystring ; ... offset of "mystring"
  5395. \c push word [myint] ; one of my variables
  5396. \c call far SomeFunc
  5397. This is equivalent to the Pascal code
  5398. \c procedure SomeFunc(String: PChar; Int: Integer);
  5399. \c SomeFunc(@mystring, myint);
  5400. \S{16bpseg} Borland Pascal \I{segment names, Borland Pascal}Segment
  5401. Name Restrictions
Since Borland Pascal's internal unit file format is completely
different from \c{OBJ}, it does only a very sketchy job of
reading and understanding the information contained in a
real \c{OBJ} file when it links one in. Therefore an object file
intended to be linked to a Pascal program must obey a number of
restrictions:
  5408. \b Procedures and functions must be in a segment whose name is
  5409. either \c{CODE}, \c{CSEG}, or something ending in \c{_TEXT}.
\b Initialized data must be in a segment whose name is either
\c{CONST} or something ending in \c{_DATA}.
  5412. \b Uninitialized data must be in a segment whose name is either
  5413. \c{DATA}, \c{DSEG}, or something ending in \c{_BSS}.
  5414. \b Any other segments in the object file are completely ignored.
  5415. \c{GROUP} directives and segment attributes are also ignored.
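A source file laid out to satisfy these restrictions might look like the sketch below. The routine and variable names are hypothetical; \c{Twice} is assumed to correspond to a Pascal declaration such as \c{function Twice(N: Integer): Integer; far;}.

```nasm
segment CODE                    ; procedures and functions

        global  Twice           ; hypothetical Pascal-callable function

Twice:  push    bp
        mov     bp,sp
        mov     ax,[bp+6]       ; N, the only parameter
        add     ax,ax           ; result is returned in AX
        pop     bp
        retf    2               ; callee removes its 2 bytes of parameters

segment CONST                   ; initialized data

factor  dw      2

segment DATA                    ; uninitialized data

scratch resw    1
```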
  5416. \S{16bpmacro} Using \i\c{c16.mac} With Pascal Programs
  5417. The \c{c16.mac} macro package, described in \k{16cmacro}, can also
  5418. be used to simplify writing functions to be called from Pascal
  5419. programs, if you code \I\c{PASCAL}\c{%define PASCAL}. This
  5420. definition ensures that functions are far (it implies
  5421. \i\c{FARCODE}), and also causes procedure return instructions to be
  5422. generated with an operand.
  5423. Defining \c{PASCAL} does not change the code which calculates the
  5424. argument offsets; you must declare your function's arguments in
  5425. reverse order. For example:
\c %define PASCAL
\c
\c proc    _pascalproc
\c
\c %$j     arg     4
\c %$i     arg
\c         mov     ax,[bp + %$i]
\c         mov     bx,[bp + %$j]
\c         mov     es,[bp + %$j + 2]
\c         add     ax,[es:bx]      ; j is a far pointer, so read through ES
\c
\c endproc
  5438. This defines the same routine, conceptually, as the example in
  5439. \k{16cmacro}: it defines a function taking two arguments, an integer
  5440. and a pointer to an integer, which returns the sum of the integer
  5441. and the contents of the pointer. The only difference between this
  5442. code and the large-model C version is that \c{PASCAL} is defined
  5443. instead of \c{FARCODE}, and that the arguments are declared in
  5444. reverse order.
  5445. \C{32bit} Writing 32-bit Code (Unix, Win32, DJGPP)
  5446. This chapter attempts to cover some of the common issues involved
  5447. when writing 32-bit code, to run under \i{Win32} or Unix, or to be
  5448. linked with C code generated by a Unix-style C compiler such as
  5449. \i{DJGPP}. It covers how to write assembly code to interface with
  5450. 32-bit C routines, and how to write position-independent code for
  5451. shared libraries.
Almost all 32-bit code, and in particular all code running under
\c{Win32}, \c{DJGPP} or any of the PC Unix variants, runs in the \I{flat
memory model}\e{flat} memory model. This means that the segment registers
and paging have already been set up to give you the same 32-bit, 4GB
address space no matter what segment you work relative to, and that
you should ignore all segment registers completely. When writing
  5458. flat-model application code, you never need to use a segment
  5459. override or modify any segment register, and the code-section
  5460. addresses you pass to \c{CALL} and \c{JMP} live in the same address
  5461. space as the data-section addresses you access your variables by and
  5462. the stack-section addresses you access local variables and procedure
  5463. parameters by. Every address is 32 bits long and contains only an
  5464. offset part.
  5465. \H{32c} Interfacing to 32-bit C Programs
  5466. A lot of the discussion in \k{16c}, about interfacing to 16-bit C
  5467. programs, still applies when working in 32 bits. The absence of
  5468. memory models or segmentation worries simplifies things a lot.
  5469. \S{32cunder} External Symbol Names
  5470. Most 32-bit C compilers share the convention used by 16-bit
  5471. compilers, that the names of all global symbols (functions or data)
  5472. they define are formed by prefixing an underscore to the name as it
  5473. appears in the C program. However, not all of them do: the \c{ELF}
  5474. specification states that C symbols do \e{not} have a leading
  5475. underscore on their assembly-language names.
The older Linux \c{a.out} C compiler, all \c{Win32} compilers,
\c{DJGPP}, \c{NetBSD} and \c{FreeBSD} use the leading
underscore; for these compilers, the macros \c{cextern} and
\c{cglobal}, as given in \k{16cunder}, will still work. For \c{ELF},
though, the leading underscore should not be used.
  5481. See also \k{opt-pfix}.
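One way to keep a single source file usable on both kinds of target is to make the macros conditional. The sketch below assumes a symbol named \c{UNDERSCORE}, defined on the command line with \c{-DUNDERSCORE} when building for an underscore-prefixing target. The name is chosen for this example and is not a NASM built-in.

```nasm
; Assemble with -DUNDERSCORE for targets that prefix C symbols with an
; underscore; leave it undefined for ELF.

%macro cextern 1
  %ifdef UNDERSCORE
        extern  _%1
        %define %1 _%1
  %else
        extern  %1
  %endif
%endmacro

%macro cglobal 1
  %ifdef UNDERSCORE
        global  _%1
        %define %1 _%1
  %else
        global  %1
  %endif
%endmacro

        cextern printf          ; printf now names the right symbol either way
```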
  5482. \S{32cfunc} Function Definitions and Function Calls
  5483. \I{functions, C calling convention}The \i{C calling convention}
  5484. in 32-bit programs is as follows. In the following description,
  5485. the words \e{caller} and \e{callee} are used to denote
  5486. the function doing the calling and the function which gets called.
  5487. \b The caller pushes the function's parameters on the stack, one
  5488. after another, in reverse order (right to left, so that the first
  5489. argument specified to the function is pushed last).
  5490. \b The caller then executes a near \c{CALL} instruction to pass
  5491. control to the callee.
  5492. \b The callee receives control, and typically (although this is not
  5493. actually necessary, in functions which do not need to access their
  5494. parameters) starts by saving the value of \c{ESP} in \c{EBP} so as
  5495. to be able to use \c{EBP} as a base pointer to find its parameters
  5496. on the stack. However, the caller was probably doing this too, so
  5497. part of the calling convention states that \c{EBP} must be preserved
  5498. by any C function. Hence the callee, if it is going to set up
  5499. \c{EBP} as a \i{frame pointer}, must push the previous value first.
  5500. \b The callee may then access its parameters relative to \c{EBP}.
  5501. The doubleword at \c{[EBP]} holds the previous value of \c{EBP} as
  5502. it was pushed; the next doubleword, at \c{[EBP+4]}, holds the return
  5503. address, pushed implicitly by \c{CALL}. The parameters start after
  5504. that, at \c{[EBP+8]}. The leftmost parameter of the function, since
  5505. it was pushed last, is accessible at this offset from \c{EBP}; the
  5506. others follow, at successively greater offsets. Thus, in a function
  5507. such as \c{printf} which takes a variable number of parameters, the
  5508. pushing of the parameters in reverse order means that the function
  5509. knows where to find its first parameter, which tells it the number
  5510. and type of the remaining ones.
  5511. \b The callee may also wish to decrease \c{ESP} further, so as to
  5512. allocate space on the stack for local variables, which will then be
  5513. accessible at negative offsets from \c{EBP}.
  5514. \b The callee, if it wishes to return a value to the caller, should
  5515. leave the value in \c{AL}, \c{AX} or \c{EAX} depending on the size
  5516. of the value. Floating-point results are typically returned in
  5517. \c{ST0}.
  5518. \b Once the callee has finished processing, it restores \c{ESP} from
  5519. \c{EBP} if it had allocated local stack space, then pops the previous
  5520. value of \c{EBP}, and returns via \c{RET} (equivalently, \c{RETN}).
  5521. \b When the caller regains control from the callee, the function
  5522. parameters are still on the stack, so it typically adds an immediate
  5523. constant to \c{ESP} to remove them (instead of executing a number of
  5524. slow \c{POP} instructions). Thus, if a function is accidentally
  5525. called with the wrong number of parameters due to a prototype
  5526. mismatch, the stack will still be returned to a sensible state since
  5527. the caller, which \e{knows} how many parameters it pushed, does the
  5528. removing.
  5529. There is an alternative calling convention used by Win32 programs
  5530. for Windows API calls, and also for functions called \e{by} the
  5531. Windows API such as window procedures: they follow what Microsoft
  5532. calls the \c{__stdcall} convention. This is slightly closer to the
  5533. Pascal convention, in that the callee clears the stack by passing a
  5534. parameter to the \c{RET} instruction. However, the parameters are
  5535. still pushed in right-to-left order.
  5536. Thus, you would define a function in C style in the following way:
\c global  _myfunc
\c
\c _myfunc:
\c         push    ebp
\c         mov     ebp,esp
\c         sub     esp,0x40        ; 64 bytes of local stack space
\c         mov     ebx,[ebp+8]     ; first parameter to function
\c
\c         ; some more code
\c
\c         leave                   ; mov esp,ebp / pop ebp
\c         ret
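For comparison, a \c{__stdcall} function differs only in how it returns. The sketch below assumes a hypothetical prototype \c{int __stdcall AddTwo(int a, int b)}; the \c{_AddTwo@8} spelling follows the decoration that Microsoft's 32-bit tools apply to \c{__stdcall} names (a leading underscore, then \c{@} and the parameter byte count).

```nasm
        global  _AddTwo@8       ; hypothetical: int __stdcall AddTwo(int a, int b)

_AddTwo@8:
        push    ebp
        mov     ebp,esp
        mov     eax,[ebp+8]     ; a: leftmost parameter, pushed last
        add     eax,[ebp+12]    ; b
        pop     ebp
        ret     8               ; callee removes both 4-byte parameters
```

A caller of such a function pushes its arguments right to left as usual, but does not adjust \c{ESP} after the call.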
  5549. At the other end of the process, to call a C function from your
  5550. assembly code, you would do something like this:
  5551. \c extern _printf
  5552. \c
  5553. \c ; and then, further down...
  5554. \c
  5555. \c push dword [myint] ; one of my integer variables
  5556. \c push dword mystring ; pointer into my data segment
  5557. \c call _printf
  5558. \c add esp,byte 8 ; `byte' saves space
  5559. \c
  5560. \c ; then those data items...
  5561. \c
  5562. \c segment _DATA
  5563. \c
  5564. \c myint dd 1234
  5565. \c mystring db 'This number -> %d <- should be 1234',10,0
  5566. This piece of code is the assembly equivalent of the C code
  5567. \c int myint = 1234;
  5568. \c printf("This number -> %d <- should be 1234\n", myint);
  5569. \S{32cdata} Accessing Data Items
To get at the contents of C variables, or to declare variables which
C can access, you need only declare the names as \c{GLOBAL} or
\c{EXTERN}. (Again, the names require leading underscores on
platforms that use them, as stated in \k{32cunder}.) Thus, a C
variable declared as \c{int i} can be accessed from assembler as
  5575. \c extern _i
  5576. \c mov eax,[_i]
  5577. And to declare your own integer variable which C programs can access
  5578. as \c{extern int j}, you do this (making sure you are assembling in
  5579. the \c{_DATA} segment, if necessary):
  5580. \c global _j
  5581. \c _j dd 0
To access a C array, you need to know the size of the components of
the array. For example, \c{int} variables are four bytes long, so if
a C program declares an array as \c{int a[10]}, you can access
\c{a[3]} by coding \c{mov eax,[_a+12]}. (The byte offset 12 is obtained
by multiplying the desired array index, 3, by the size of the array
element, 4.) The sizes of the C base types in 32-bit compilers are:
1 for \c{char}, 2 for \c{short}, 4 for \c{int}, \c{long} and
\c{float}, and 8 for \c{double}. Pointers, being 32-bit addresses,
are also 4 bytes long.
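For an index known only at run time, 32-bit addressing lets the scaling be folded into the effective address. The fragment below is a sketch assuming a second C variable \c{int n} that holds the index (the name is hypothetical), on a target that uses leading underscores.

```nasm
        extern  _a              ; int a[10]; in the C program
        extern  _n              ; hypothetical: int n; the index

        mov     ecx,[_n]        ; ECX = index
        mov     eax,[_a+ecx*4]  ; EAX = a[n]; the *4 scales by sizeof(int)
```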
  5591. To access a C \i{data structure}, you need to know the offset from
  5592. the base of the structure to the field you are interested in. You
  5593. can either do this by converting the C structure definition into a
  5594. NASM structure definition (using \c{STRUC}), or by calculating the
  5595. one offset and using just that.
  5596. To do either of these, you should read your C compiler's manual to
  5597. find out how it organizes data structures. NASM gives no special
  5598. alignment to structure members in its own \i\c{STRUC} macro, so you
  5599. have to specify alignment yourself if the C compiler generates it.
  5600. Typically, you might find that a structure like
  5601. \c struct {
  5602. \c char c;
  5603. \c int i;
  5604. \c } foo;
  5605. might be eight bytes long rather than five, since the \c{int} field
  5606. would be aligned to a four-byte boundary. However, this sort of
  5607. feature is sometimes a configurable option in the C compiler, either
  5608. using command-line options or \c{#pragma} lines, so you have to find
  5609. out how your own compiler does it.
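For instance, assuming your compiler does pad the structure above to eight bytes, a matching NASM definition might look like the following sketch. The names \c{foo_t} and \c{_foo} are illustrative; \c{ALIGNB} supplies the padding that the C compiler is assumed to insert.

\c struc foo_t
\c   .c:    resb 1          ; char c, at offset 0
\c          alignb 4        ; assumed padding: align the next field to 4 bytes
\c   .i:    resd 1          ; int i, at offset 4
\c endstruc                 ; foo_t_size is now 8
\c
\c extern _foo              ; the C variable `foo'
\c
\c         mov     eax,[_foo+foo_t.i]   ; load foo.i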
  5610. \S{32cmacro} \i\c{c32.mac}: Helper Macros for the 32-bit C Interface
  5611. Included in the NASM archives, in the \I{misc directory}\c{misc}
  5612. directory, is a file \c{c32.mac} of macros. It defines three macros:
  5613. \i\c{proc}, \i\c{arg} and \i\c{endproc}. These are intended to be
  5614. used for C-style procedure definitions, and they automate a lot of
  5615. the work involved in keeping track of the calling convention.
  5616. An example of an assembly function using the macro set is given
  5617. here:
  5618. \c proc _proc32
  5619. \c
  5620. \c %$i arg
  5621. \c %$j arg
  5622. \c mov eax,[ebp + %$i]
  5623. \c mov ebx,[ebp + %$j]
  5624. \c add eax,[ebx]
  5625. \c
  5626. \c endproc
  5627. This defines \c{_proc32} to be a procedure taking two arguments, the
  5628. first (\c{i}) an integer and the second (\c{j}) a pointer to an
  5629. integer. It returns \c{i + *j}.
  5630. Note that the \c{arg} macro has an \c{EQU} as the first line of its
  5631. expansion, and since the label before the macro call gets prepended
  5632. to the first line of the expanded macro, the \c{EQU} works, defining
\c{%$i} to be an offset from \c{EBP}. A context-local variable is
  5634. used, local to the context pushed by the \c{proc} macro and popped
  5635. by the \c{endproc} macro, so that the same argument name can be used
  5636. in later procedures. Of course, you don't \e{have} to do that.
  5637. \c{arg} can take an optional parameter, giving the size of the
  5638. argument. If no size is given, 4 is assumed, since it is likely that
  5639. many function parameters will be of type \c{int} or pointers.
  5640. \H{picdll} Writing NetBSD/FreeBSD/OpenBSD and Linux/ELF \i{Shared
  5641. Libraries}
  5642. \c{ELF} replaced the older \c{a.out} object file format under Linux
  5643. because it contains support for \i{position-independent code}
  5644. (\i{PIC}), which makes writing shared libraries much easier. NASM
  5645. supports the \c{ELF} position-independent code features, so you can
  5646. write Linux \c{ELF} shared libraries in NASM.
  5647. \i{NetBSD}, and its close cousins \i{FreeBSD} and \i{OpenBSD}, take
  5648. a different approach by hacking PIC support into the \c{a.out}
  5649. format. NASM supports this as the \i\c{aoutb} output format, so you
  5650. can write \i{BSD} shared libraries in NASM too.
  5651. The operating system loads a PIC shared library by memory-mapping
  5652. the library file at an arbitrarily chosen point in the address space
  5653. of the running process. The contents of the library's code section
  5654. must therefore not depend on where it is loaded in memory.
  5655. Therefore, you cannot get at your variables by writing code like
  5656. this:
  5657. \c mov eax,[myvar] ; WRONG
  5658. Instead, the linker provides an area of memory called the
  5659. \i\e{global offset table}, or \i{GOT}; the GOT is situated at a
  5660. constant distance from your library's code, so if you can find out
  5661. where your library is loaded (which is typically done using a
  5662. \c{CALL} and \c{POP} combination), you can obtain the address of the
  5663. GOT, and you can then load the addresses of your variables out of
  5664. linker-generated entries in the GOT.
  5665. The \e{data} section of a PIC shared library does not have these
  5666. restrictions: since the data section is writable, it has to be
  5667. copied into memory anyway rather than just paged in from the library
  5668. file, so as long as it's being copied it can be relocated too. So
  5669. you can put ordinary types of relocation in the data section without
  5670. too much worry (but see \k{picglobal} for a caveat).
  5671. \S{picgot} Obtaining the Address of the GOT
  5672. Each code module in your shared library should define the GOT as an
  5673. external symbol:
  5674. \c extern _GLOBAL_OFFSET_TABLE_ ; in ELF
  5675. \c extern __GLOBAL_OFFSET_TABLE_ ; in BSD a.out
  5676. At the beginning of any function in your shared library which plans
  5677. to access your data or BSS sections, you must first calculate the
  5678. address of the GOT. This is typically done by writing the function
  5679. in this form:
  5680. \c func: push ebp
  5681. \c mov ebp,esp
  5682. \c push ebx
  5683. \c call .get_GOT
  5684. \c .get_GOT:
  5685. \c pop ebx
  5686. \c add ebx,_GLOBAL_OFFSET_TABLE_+$$-.get_GOT wrt ..gotpc
  5687. \c
  5688. \c ; the function body comes here
  5689. \c
  5690. \c mov ebx,[ebp-4]
  5691. \c mov esp,ebp
  5692. \c pop ebp
  5693. \c ret
(For BSD, again, the symbol \c{_GLOBAL_OFFSET_TABLE_} requires a
  5695. second leading underscore.)
  5696. The first two lines of this function are simply the standard C
  5697. prologue to set up a stack frame, and the last three lines are
  5698. standard C function epilogue. The third line, and the fourth to last
  5699. line, save and restore the \c{EBX} register, because PIC shared
  5700. libraries use this register to store the address of the GOT.
  5701. The interesting bit is the \c{CALL} instruction and the following
  5702. two lines. The \c{CALL} and \c{POP} combination obtains the address
  5703. of the label \c{.get_GOT}, without having to know in advance where
  5704. the program was loaded (since the \c{CALL} instruction is encoded
  5705. relative to the current position). The \c{ADD} instruction makes use
  5706. of one of the special PIC relocation types: \i{GOTPC relocation}.
  5707. With the \i\c{WRT ..gotpc} qualifier specified, the symbol
  5708. referenced (here \c{_GLOBAL_OFFSET_TABLE_}, the special symbol
  5709. assigned to the GOT) is given as an offset from the beginning of the
  5710. section. (Actually, \c{ELF} encodes it as the offset from the operand
  5711. field of the \c{ADD} instruction, but NASM simplifies this
  5712. deliberately, so you do things the same way for both \c{ELF} and
  5713. \c{BSD}.) So the instruction then \e{adds} the beginning of the section,
  5714. to get the real address of the GOT, and subtracts the value of
  5715. \c{.get_GOT} which it knows is in \c{EBX}. Therefore, by the time
  5716. that instruction has finished, \c{EBX} contains the address of the GOT.
  5717. If you didn't follow that, don't worry: it's never necessary to
  5718. obtain the address of the GOT by any other means, so you can put
  5719. those three instructions into a macro and safely ignore them:
  5720. \c %macro get_GOT 0
  5721. \c
  5722. \c call %%getgot
  5723. \c %%getgot:
  5724. \c pop ebx
  5725. \c add ebx,_GLOBAL_OFFSET_TABLE_+$$-%%getgot wrt ..gotpc
  5726. \c
  5727. \c %endmacro
  5728. \S{piclocal} Finding Your Local Data Items
  5729. Having got the GOT, you can then use it to obtain the addresses of
  5730. your data items. Most variables will reside in the sections you have
  5731. declared; they can be accessed using the \I{GOTOFF
relocation}\c{..gotoff} special \I\c{WRT ..gotoff}\c{WRT} type. It
works like this:
  5734. \c lea eax,[ebx+myvar wrt ..gotoff]
  5735. The expression \c{myvar wrt ..gotoff} is calculated, when the shared
  5736. library is linked, to be the offset to the local variable \c{myvar}
  5737. from the beginning of the GOT. Therefore, adding it to \c{EBX} as
  5738. above will place the real address of \c{myvar} in \c{EAX}.
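Since the expression is just an offset added to \c{EBX}, it can appear in any effective address, not only in \c{LEA}. A short sketch, assuming the \c{get_GOT} macro from \k{picgot} and a doubleword variable \c{myvar} in your own data section:

\c         get_GOT                                ; EBX = address of the GOT
\c         mov     eax,[ebx+myvar wrt ..gotoff]   ; load the value of myvar
\c         inc     dword [ebx+myvar wrt ..gotoff] ; or modify it in place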
  5739. If you declare variables as \c{GLOBAL} without specifying a size for
  5740. them, they are shared between code modules in the library, but do
  5741. not get exported from the library to the program that loaded it.
  5742. They will still be in your ordinary data and BSS sections, so you
  5743. can access them in the same way as local variables, using the above
  5744. \c{..gotoff} mechanism.
  5745. Note that due to a peculiarity of the way BSD \c{a.out} format
  5746. handles this relocation type, there must be at least one non-local
  5747. symbol in the same section as the address you're trying to access.
  5748. \S{picextern} Finding External and Common Data Items
  5749. If your library needs to get at an external variable (external to
  5750. the \e{library}, not just to one of the modules within it), you must
  5751. use the \I{GOT relocations}\I\c{WRT ..got}\c{..got} type to get at
  5752. it. The \c{..got} type, instead of giving you the offset from the
  5753. GOT base to the variable, gives you the offset from the GOT base to
  5754. a GOT \e{entry} containing the address of the variable. The linker
  5755. will set up this GOT entry when it builds the library, and the
  5756. dynamic linker will place the correct address in it at load time. So
  5757. to obtain the address of an external variable \c{extvar} in \c{EAX},
  5758. you would code
  5759. \c mov eax,[ebx+extvar wrt ..got]
  5760. This loads the address of \c{extvar} out of an entry in the GOT. The
  5761. linker, when it builds the shared library, collects together every
  5762. relocation of type \c{..got}, and builds the GOT so as to ensure it
  5763. has every necessary entry present.
  5764. Common variables must also be accessed in this way.
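Note that a \c{..got} load yields the \e{address} of the variable; a second load is needed to fetch its value. A sketch, assuming \c{extvar} is a doubleword and \c{EBX} already holds the GOT address:

\c extern extvar
\c
\c         mov     eax,[ebx+extvar wrt ..got]   ; EAX = address of extvar
\c         mov     eax,[eax]                    ; EAX = contents of extvar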
  5765. \S{picglobal} Exporting Symbols to the Library User
  5766. If you want to export symbols to the user of the library, you have
  5767. to declare whether they are functions or data, and if they are data,
  5768. you have to give the size of the data item. This is because the
  5769. dynamic linker has to build \I{PLT}\i{procedure linkage table}
  5770. entries for any exported functions, and also moves exported data
  5771. items away from the library's data section in which they were
  5772. declared.
  5773. So to export a function to users of the library, you must use
  5774. \c global func:function ; declare it as a function
  5775. \c
  5776. \c func: push ebp
  5777. \c
  5778. \c ; etc.
  5779. And to export a data item such as an array, you would have to code
  5780. \c global array:data array.end-array ; give the size too
  5781. \c
  5782. \c array: resd 128
  5783. \c .end:
  5784. Be careful: If you export a variable to the library user, by
  5785. declaring it as \c{GLOBAL} and supplying a size, the variable will
  5786. end up living in the data section of the main program, rather than
  5787. in your library's data section, where you declared it. So you will
  5788. have to access your own global variable with the \c{..got} mechanism
  5789. rather than \c{..gotoff}, as if it were external (which,
  5790. effectively, it has become).
  5791. Equally, if you need to store the address of an exported global in
  5792. one of your data sections, you can't do it by means of the standard
  5793. sort of code:
  5794. \c dataptr: dd global_data_item ; WRONG
  5795. NASM will interpret this code as an ordinary relocation, in which
  5796. \c{global_data_item} is merely an offset from the beginning of the
  5797. \c{.data} section (or whatever); so this reference will end up
  5798. pointing at your data section instead of at the exported global
  5799. which resides elsewhere.
  5800. Instead of the above code, then, you must write
  5801. \c dataptr: dd global_data_item wrt ..sym
  5802. which makes use of the special \c{WRT} type \I\c{WRT ..sym}\c{..sym}
  5803. to instruct NASM to search the symbol table for a particular symbol
  5804. at that address, rather than just relocating by section base.
  5805. Either method will work for functions: referring to one of your
  5806. functions by means of
  5807. \c funcptr: dd my_function
  5808. will give the user the address of the code you wrote, whereas
  5809. \c funcptr: dd my_function wrt ..sym
  5810. will give the address of the procedure linkage table for the
  5811. function, which is where the calling program will \e{believe} the
  5812. function lives. Either address is a valid way to call the function.
  5813. \S{picproc} Calling Procedures Outside the Library
  5814. Calling procedures outside your shared library has to be done by
  5815. means of a \i\e{procedure linkage table}, or \i{PLT}. The PLT is
  5816. placed at a known offset from where the library is loaded, so the
  5817. library code can make calls to the PLT in a position-independent
  5818. way. Within the PLT there is code to jump to offsets contained in
  5819. the GOT, so function calls to other shared libraries or to routines
  5820. in the main program can be transparently passed off to their real
  5821. destinations.
  5822. To call an external routine, you must use another special PIC
  5823. relocation type, \I{PLT relocations}\i\c{WRT ..plt}. This is much
  5824. easier than the GOT-based ones: you simply replace calls such as
  5825. \c{CALL printf} with the PLT-relative version \c{CALL printf WRT
  5826. ..plt}.
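As a sketch of a complete call, the fragment below passes a local string to \c{printf} from library code. It assumes ELF (so no leading underscore), the \c{get_GOT} macro from \k{picgot}, and a hypothetical zero-terminated string \c{msg} in the library's data section. \c{EBX} is kept pointing at the GOT because the PLT code on i386 relies on it.

\c extern printf
\c
\c         get_GOT                            ; EBX = address of the GOT
\c         lea     eax,[ebx+msg wrt ..gotoff] ; address of our local string
\c         push    eax
\c         call    printf wrt ..plt           ; call through the PLT
\c         add     esp,4                      ; remove the argument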
  5827. \S{link} Generating the Library File
  5828. Having written some code modules and assembled them to \c{.o} files,
  5829. you then generate your shared library with a command such as
  5830. \c ld -shared -o library.so module1.o module2.o # for ELF
  5831. \c ld -Bshareable -o library.so module1.o module2.o # for BSD
  5832. For ELF, if your shared library is going to reside in system
  5833. directories such as \c{/usr/lib} or \c{/lib}, it is usually worth
  5834. using the \i\c{-soname} flag to the linker, to store the final
  5835. library file name, with a version number, into the library:
  5836. \c ld -shared -soname library.so.1 -o library.so.1.2 *.o
  5837. You would then copy \c{library.so.1.2} into the library directory,
  5838. and create \c{library.so.1} as a symbolic link to it.
  5839. \C{mixsize} Mixing 16 and 32 Bit Code
  5840. This chapter tries to cover some of the issues, largely related to
  5841. unusual forms of addressing and jump instructions, encountered when
  5842. writing operating system code such as protected-mode initialisation
  5843. routines, which require code that operates in mixed segment sizes,
  5844. such as code in a 16-bit segment trying to modify data in a 32-bit
  5845. one, or jumps between different-size segments.
  5846. \H{mixjump} Mixed-Size Jumps\I{jumps, mixed-size}
  5847. \I{operating system, writing}\I{writing operating systems}The most
  5848. common form of \i{mixed-size instruction} is the one used when
  5849. writing a 32-bit OS: having done your setup in 16-bit mode, such as
  5850. loading the kernel, you then have to boot it by switching into
  5851. protected mode and jumping to the 32-bit kernel start address. In a
  5852. fully 32-bit OS, this tends to be the \e{only} mixed-size
  5853. instruction you need, since everything before it can be done in pure
  5854. 16-bit code, and everything after it can be pure 32-bit.
  5855. This jump must specify a 48-bit far address, since the target
  5856. segment is a 32-bit one. However, it must be assembled in a 16-bit
  5857. segment, so just coding, for example,
  5858. \c jmp 0x1234:0x56789ABC ; wrong!
  5859. will not work, since the offset part of the address will be
  5860. truncated to \c{0x9ABC} and the jump will be an ordinary 16-bit far
  5861. one.
  5862. The Linux kernel setup code gets round the inability of \c{as86} to
  5863. generate the required instruction by coding it manually, using
  5864. \c{DB} instructions. NASM can go one better than that, by actually
  5865. generating the right instruction itself. Here's how to do it right:
  5866. \c jmp dword 0x1234:0x56789ABC ; right
  5867. \I\c{JMP DWORD}The \c{DWORD} prefix (strictly speaking, it should
  5868. come \e{after} the colon, since it is declaring the \e{offset} field
  5869. to be a doubleword; but NASM will accept either form, since both are
unambiguous) forces the offset part to be treated as a 32-bit value, on the
assumption that you are deliberately writing a jump from a 16-bit
  5872. segment to a 32-bit one.
  5873. You can do the reverse operation, jumping from a 32-bit segment to a
  5874. 16-bit one, by means of the \c{WORD} prefix:
  5875. \c jmp word 0x8765:0x4321 ; 32 to 16 bit
  5876. If the \c{WORD} prefix is specified in 16-bit mode, or the \c{DWORD}
  5877. prefix in 32-bit mode, they will be ignored, since each is
  5878. explicitly forcing NASM into a mode it was in anyway.
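Put together, the switch into protected mode might be sketched as follows. The selector \c{0x08} and the label \c{start32} are assumptions for illustration, and the code presumes that interrupts are disabled and a suitable GDT has already been loaded with \c{LGDT}:

\c         bits 16
\c
\c         mov     eax,cr0
\c         or      al,1                 ; set the PE (protection enable) bit
\c         mov     cr0,eax
\c         jmp     dword 0x08:start32   ; 16-bit code, 32-bit offset
\c
\c         bits 32
\c start32:                             ; now executing 32-bit code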
  5879. \H{mixaddr} Addressing Between Different-Size Segments\I{addressing,
  5880. mixed-size}\I{mixed-size addressing}
  5881. If your OS is mixed 16 and 32-bit, or if you are writing a DOS
  5882. extender, you are likely to have to deal with some 16-bit segments
  5883. and some 32-bit ones. At some point, you will probably end up
  5884. writing code in a 16-bit segment which has to access data in a
  5885. 32-bit segment, or vice versa.
  5886. If the data you are trying to access in a 32-bit segment lies within
  5887. the first 64K of the segment, you may be able to get away with using
  5888. an ordinary 16-bit addressing operation for the purpose; but sooner
  5889. or later, you will want to do 32-bit addressing from 16-bit mode.
  5890. The easiest way to do this is to make sure you use a register for
  5891. the address, since any effective address containing a 32-bit
  5892. register is forced to be a 32-bit address. So you can do
  5893. \c mov eax,offset_into_32_bit_segment_specified_by_fs
  5894. \c mov dword [fs:eax],0x11223344
  5895. This is fine, but slightly cumbersome (since it wastes an
  5896. instruction and a register) if you already know the precise offset
  5897. you are aiming at. The x86 architecture does allow 32-bit effective
  5898. addresses to specify nothing but a 4-byte offset, so why shouldn't
  5899. NASM be able to generate the best instruction for the purpose?
  5900. It can. As in \k{mixjump}, you need only prefix the address with the
  5901. \c{DWORD} keyword, and it will be forced to be a 32-bit address:
  5902. \c mov dword [fs:dword my_offset],0x11223344
  5903. Also as in \k{mixjump}, NASM is not fussy about whether the
  5904. \c{DWORD} prefix comes before or after the segment override, so
  5905. arguably a nicer-looking way to code the above instruction is
  5906. \c mov dword [dword fs:my_offset],0x11223344
  5907. Don't confuse the \c{DWORD} prefix \e{outside} the square brackets,
  5908. which controls the size of the data stored at the address, with the
one \e{inside} the square brackets which controls the length of the
  5910. address itself. The two can quite easily be different:
  5911. \c mov word [dword 0x12345678],0x9ABC
  5912. This moves 16 bits of data to an address specified by a 32-bit
  5913. offset.
  5914. You can also specify \c{WORD} or \c{DWORD} prefixes along with the
  5915. \c{FAR} prefix to indirect far jumps or calls. For example:
  5916. \c call dword far [fs:word 0x4321]
  5917. This instruction contains an address specified by a 16-bit offset;
  5918. it loads a 48-bit far pointer from that (16-bit segment and 32-bit
  5919. offset), and calls that address.
  5920. \H{mixother} Other Mixed-Size Instructions
  5921. The other way you might want to access data might be using the
  5922. string instructions (\c{LODSx}, \c{STOSx} and so on) or the
  5923. \c{XLATB} instruction. These instructions, since they take no
  5924. parameters, might seem to have no easy way to make them perform
  5925. 32-bit addressing when assembled in a 16-bit segment.
  5926. This is the purpose of NASM's \i\c{a16}, \i\c{a32} and \i\c{a64} prefixes. If
  5927. you are coding \c{LODSB} in a 16-bit segment but it is supposed to
  5928. be accessing a string in a 32-bit segment, you should load the
  5929. desired address into \c{ESI} and then code
  5930. \c a32 lodsb
  5931. The prefix forces the addressing size to 32 bits, meaning that
  5932. \c{LODSB} loads from \c{[DS:ESI]} instead of \c{[DS:SI]}. To access
  5933. a string in a 16-bit segment when coding in a 32-bit one, the
  5934. corresponding \c{a16} prefix can be used.
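The same prefix works with repeated string instructions. In this sketch, which assumes 16-bit code and hypothetical 32-bit offsets \c{src_offset} and \c{dst_offset} with a byte count \c{count}, \c{a32} makes \c{REP MOVSB} use \c{ESI}, \c{EDI} and \c{ECX} instead of \c{SI}, \c{DI} and \c{CX}:

\c         bits 16
\c
\c         mov     esi,src_offset   ; source offset, relative to DS
\c         mov     edi,dst_offset   ; destination offset, relative to ES
\c         mov     ecx,count        ; number of bytes to copy
\c         cld                      ; copy upwards
\c         a32 rep movsb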
  5935. The \c{a16}, \c{a32} and \c{a64} prefixes can be applied to any instruction
  5936. in NASM's instruction table, but most of them can generate all the
  5937. useful forms without them. The prefixes are necessary only for
  5938. instructions with implicit addressing:
  5939. \# \c{CMPSx} (\k{insCMPSB}),
  5940. \# \c{SCASx} (\k{insSCASB}), \c{LODSx} (\k{insLODSB}), \c{STOSx}
  5941. \# (\k{insSTOSB}), \c{MOVSx} (\k{insMOVSB}), \c{INSx} (\k{insINSB}),
  5942. \# \c{OUTSx} (\k{insOUTSB}), and \c{XLATB} (\k{insXLATB}).
  5943. \c{CMPSx}, \c{SCASx}, \c{LODSx}, \c{STOSx}, \c{MOVSx}, \c{INSx},
  5944. \c{OUTSx}, and \c{XLATB}.
  5945. Also, the
  5946. various push and pop instructions (\c{PUSHA} and \c{POPF} as well as
  5947. the more usual \c{PUSH} and \c{POP}) can accept \c{a16}, \c{a32} or \c{a64}
  5948. prefixes to force a particular one of \c{SP}, \c{ESP} or \c{RSP} to be used
  5949. as a stack pointer, in case the stack segment in use is a different
  5950. size from the code segment.
  5951. \c{PUSH} and \c{POP}, when applied to segment registers in 32-bit
  5952. mode, also have the slightly odd behaviour that they push and pop 4
  5953. bytes at a time, of which the top two are ignored and the bottom two
  5954. give the value of the segment register being manipulated. To force
  5955. the 16-bit behaviour of segment-register push and pop instructions,
  5956. you can use the operand-size prefix \i\c{o16}:
  5957. \c o16 push ss
  5958. \c o16 push ds
  5959. This code saves a doubleword of stack space by fitting two segment
  5960. registers into the space which would normally be consumed by pushing
  5961. one.
  5962. (You can also use the \i\c{o32} prefix to force the 32-bit behaviour
  5963. when in 16-bit mode, but this seems less useful.)
  5964. \C{64bit} Writing 64-bit Code (Unix, Win64)
  5965. This chapter attempts to cover some of the common issues involved when
  5966. writing 64-bit code, to run under \i{Win64} or Unix. It covers how to
  5967. write assembly code to interface with 64-bit C routines, and how to
  5968. write position-independent code for shared libraries.
  5969. All 64-bit code uses a flat memory model, since segmentation is not
  5970. available in 64-bit mode. The one exception is the \c{FS} and \c{GS}
  5971. registers, which still add their bases.
  5972. Position independence in 64-bit mode is significantly simpler, since
  5973. the processor supports \c{RIP}-relative addressing directly; see the
  5974. \c{REL} keyword (\k{effaddr}). On most 64-bit platforms, it is
  5975. probably desirable to make that the default, using the directive
  5976. \c{DEFAULT REL} (\k{default}).
  5977. 64-bit programming is relatively similar to 32-bit programming, but
  5978. of course pointers are 64 bits long; additionally, all existing
  5979. platforms pass arguments in registers rather than on the stack.
  5980. Furthermore, 64-bit platforms use SSE2 by default for floating point.
  5981. Please see the ABI documentation for your platform.
  5982. 64-bit platforms differ in the sizes of the C/C++ fundamental
  5983. datatypes, not just from 32-bit platforms but from each other. If a
  5984. specific size data type is desired, it is probably best to use the
  5985. types defined in the standard C header \c{<inttypes.h>}.
All known 64-bit platforms except some embedded platforms require that
the stack pointer (\c{RSP}) is 16-byte aligned immediately before each
\c{CALL} instruction. Since \c{CALL} pushes an 8-byte return address,
\c{RSP} is an \e{odd} multiple of 8 bytes on entry to the called function.
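As an illustration, on entry to a function \c{RSP} is 8 bytes off a 16-byte boundary, because the \c{CALL} that reached it pushed a return address. A function that makes calls of its own therefore has to adjust \c{RSP} by an odd multiple of 8 first. A minimal sketch for the Unix ABI, with hypothetical names \c{func} and \c{other} (Win64 additionally expects 32 bytes of shadow space, not shown):

\c func:   sub     rsp,8       ; realign RSP to a 16-byte boundary
\c         call    other       ; RSP is 16-byte aligned at this CALL
\c         add     rsp,8       ; undo the adjustment
\c         ret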
In 64-bit mode, the default operand size is still 32 bits. When
  5991. loading a value into a 32-bit register (but not an 8- or 16-bit
  5992. register), the upper 32 bits of the corresponding 64-bit register are
  5993. set to zero.
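The difference is easy to see with a register whose upper bits are all set. The values in the comments follow directly from the rule above:

\c         mov     rax,-1      ; RAX = 0xFFFFFFFFFFFFFFFF
\c         mov     eax,1       ; RAX = 0x0000000000000001 (upper half cleared)
\c
\c         mov     rax,-1      ; RAX = 0xFFFFFFFFFFFFFFFF
\c         mov     al,1        ; RAX = 0xFFFFFFFFFFFFFF01 (upper bits preserved)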
  5994. \H{reg64} Register Names in 64-bit Mode
  5995. NASM uses the following names for general-purpose registers in 64-bit
  5996. mode, for 8-, 16-, 32- and 64-bit references, respectively:
  5997. \c AL/AH, CL/CH, DL/DH, BL/BH, SPL, BPL, SIL, DIL, R8B-R15B
  5998. \c AX, CX, DX, BX, SP, BP, SI, DI, R8W-R15W
  5999. \c EAX, ECX, EDX, EBX, ESP, EBP, ESI, EDI, R8D-R15D
  6000. \c RAX, RCX, RDX, RBX, RSP, RBP, RSI, RDI, R8-R15
  6001. This is consistent with the AMD documentation and most other
  6002. assemblers. The Intel documentation, however, uses the names
  6003. \c{R8L-R15L} for 8-bit references to the higher registers. It is
possible to use those names by defining them as macros; similarly,
  6005. if one wants to use numeric names for the low 8 registers, define them
  6006. as macros. The standard macro package \c{altreg} (see \k{pkg_altreg})
  6007. can be used for this purpose.
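A short sketch of the package in use; in the alternate naming, \c{R0L} corresponds to \c{AL} and \c{R3H} to \c{BH}:

\c %use altreg
\c
\c         mov     r0l,r3h     ; same as mov al,bh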
  6008. \H{id64} Immediates and Displacements in 64-bit Mode
  6009. In 64-bit mode, immediates and displacements are generally only 32
  6010. bits wide. NASM will therefore truncate most displacements and
  6011. immediates to 32 bits.
  6012. The only instruction which takes a full \i{64-bit immediate} is:
  6013. \c MOV reg64,imm64
  6014. NASM will produce this instruction whenever the programmer uses
  6015. \c{MOV} with an immediate into a 64-bit register. If this is not
  6016. desirable, simply specify the equivalent 32-bit register, which will
  6017. be automatically zero-extended by the processor, or specify the
  6018. immediate as \c{DWORD}:
  6019. \c mov rax,foo ; 64-bit immediate
  6020. \c mov rax,qword foo ; (identical)
  6021. \c mov eax,foo ; 32-bit immediate, zero-extended
  6022. \c mov rax,dword foo ; 32-bit immediate, sign-extended
The lengths of these instructions are 10, 5 and 7 bytes, respectively.
  6024. If optimization is enabled and NASM can determine at assembly time
  6025. that a shorter instruction will suffice, the shorter instruction will
  6026. be emitted unless of course \c{STRICT QWORD} or \c{STRICT DWORD} is
  6027. specified (see \k{strict}):
  6028. \c mov rax,1 ; Assembles as "mov eax,1" (5 bytes)
  6029. \c mov rax,strict qword 1 ; Full 10-byte instruction
  6030. \c mov rax,strict dword 1 ; 7-byte instruction
  6031. \c mov rax,symbol ; 10 bytes, not known at assembly time
  6032. \c lea rax,[rel symbol] ; 7 bytes, usually preferred by the ABI
  6033. Note that \c{lea rax,[rel symbol]} is position-independent, whereas
  6034. \c{mov rax,symbol} is not. Most ABIs prefer or even require
  6035. position-independent code in 64-bit mode. However, the \c{MOV}
  6036. instruction is able to reference a symbol anywhere in the 64-bit
address space, whereas \c{LEA} is only able to access a symbol within
2 GB of the instruction itself (see below).
The only instructions which take a full \I{64-bit displacement}64-bit
\e{displacement} are those loading or storing, using \c{MOV}, \c{AL}, \c{AX},
\c{EAX} or \c{RAX} (but no other registers) from or to an absolute 64-bit address.
  6042. Since this is a relatively rarely used instruction (64-bit code generally uses
  6043. relative addressing), the programmer has to explicitly declare the
  6044. displacement size as \c{ABS QWORD}:
  6045. \c default abs
  6046. \c
  6047. \c mov eax,[foo] ; 32-bit absolute disp, sign-extended
  6048. \c mov eax,[a32 foo] ; 32-bit absolute disp, zero-extended
  6049. \c mov eax,[qword foo] ; 64-bit absolute disp
  6050. \c
  6051. \c default rel
  6052. \c
  6053. \c mov eax,[foo] ; 32-bit relative disp
\c mov eax,[a32 foo] ; ditto, address truncated to 32 bits(!)
  6055. \c mov eax,[qword foo] ; error
  6056. \c mov eax,[abs qword foo] ; 64-bit absolute disp
  6057. A sign-extended absolute displacement can access from -2 GB to +2 GB;
  6058. a zero-extended absolute displacement can access from 0 to 4 GB.
  6059. \H{unix64} Interfacing to 64-bit C Programs (Unix)
  6060. On Unix, the 64-bit ABI as well as the x32 ABI (32-bit ABI with the
  6061. CPU in 64-bit mode) is defined by the documents at:
  6062. \W{http://www.nasm.us/abi/unix64}\c{http://www.nasm.us/abi/unix64}
  6063. Although written for AT&T-syntax assembly, the concepts apply equally
  6064. well for NASM-style assembly. What follows is a simplified summary.
  6065. The first six integer arguments (from the left) are passed in \c{RDI},
  6066. \c{RSI}, \c{RDX}, \c{RCX}, \c{R8}, and \c{R9}, in that order.
  6067. Additional integer arguments are passed on the stack. These
  6068. registers, plus \c{RAX}, \c{R10} and \c{R11} are destroyed by function
  6069. calls, and thus are available for use by the function without saving.
  6070. Integer return values are passed in \c{RAX} and \c{RDX}, in that order.
  6071. Floating point is done using SSE registers, except for \c{long
  6072. double}, which is 80 bits (\c{TWORD}) on most platforms (Android is
  6073. one exception; there \c{long double} is 64 bits and treated the same
  6074. as \c{double}.) Floating-point arguments are passed in \c{XMM0} to
\c{XMM7}; return is \c{XMM0} and \c{XMM1}. \c{long double} values are passed
on the stack, and returned in \c{ST0} and \c{ST1}.
  6077. All SSE and x87 registers are destroyed by function calls.
  6078. On 64-bit Unix, \c{long} is 64 bits.
  6079. Integer and SSE register arguments are counted separately, so for the case of
  6080. \c void foo(long a, double b, int c)
  6081. \c{a} is passed in \c{RDI}, \c{b} in \c{XMM0}, and \c{c} in \c{ESI}.
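As a sketch, a function callable from C as \c{long sum3(long a, long b, long c)} needs no stack frame at all under these rules. The name \c{sum3} is hypothetical, and ELF is assumed, so there is no leading underscore:

\c global sum3
\c
\c sum3:   lea     rax,[rdi+rsi]   ; a + b
\c         add     rax,rdx         ; + c
\c         ret                     ; result in RAX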
  6082. \H{win64} Interfacing to 64-bit C Programs (Win64)
  6083. The Win64 ABI is described by the document at:
  6084. \W{http://www.nasm.us/abi/win64}\c{http://www.nasm.us/abi/win64}
  6085. What follows is a simplified summary.
  6086. The first four integer arguments are passed in \c{RCX}, \c{RDX},
  6087. \c{R8} and \c{R9}, in that order. Additional integer arguments are
  6088. passed on the stack. These registers, plus \c{RAX}, \c{R10} and
  6089. \c{R11} are destroyed by function calls, and thus are available for
  6090. use by the function without saving.
  6091. Integer return values are passed in \c{RAX} only.
  6092. Floating point is done using SSE registers, except for \c{long
  6093. double}. Floating-point arguments are passed in \c{XMM0} to \c{XMM3};
  6094. return is \c{XMM0} only.
  6095. On Win64, \c{long} is 32 bits; \c{long long} or \c{_int64} is 64 bits.
  6096. Integer and SSE register arguments are counted together, so for the case of
  6097. \c void foo(long long a, double b, int c)
  6098. \c{a} is passed in \c{RCX}, \c{b} in \c{XMM1}, and \c{c} in \c{R8D}.
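As a sketch, a function declared in C as \c{long long sum3(long long a, long long b, long long c)} receives its three arguments in \c{RCX}, \c{RDX} and \c{R8}, and needs no stack frame. The name \c{sum3} is hypothetical:

\c global sum3
\c
\c sum3:   lea     rax,[rcx+rdx]   ; a + b
\c         add     rax,r8          ; + c
\c         ret                     ; result in RAX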
  6099. \C{trouble} Troubleshooting
  6100. This chapter describes some of the common problems that users have
  6101. been known to encounter with NASM, and answers them. If you think you
  6102. have found a bug in NASM, please see \k{bugs}.
  6103. \H{problems} Common Problems
  6104. \S{inefficient} NASM Generates \i{Inefficient Code}
  6105. We sometimes get `bug' reports about NASM generating inefficient, or
  6106. even `wrong', code on instructions such as \c{ADD ESP,8}. This is a
  6107. deliberate design feature, connected to predictability of output:
  6108. NASM, on seeing \c{ADD ESP,8}, will generate the form of the
  6109. instruction which leaves room for a 32-bit offset. You need to code
  6110. \I\c{BYTE}\c{ADD ESP,BYTE 8} if you want the space-efficient form of
the instruction. This isn't a bug, it's user error: if you prefer to
have NASM produce the more efficient code automatically, enable
optimization with the \c{-O} option (see \k{opt-O}).
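For reference, the two encodings in question are sketched below for 32-bit code. The byte values are the standard x86 encodings; which one NASM picks for the first line depends on the optimization setting, so treat the first comment as describing the unoptimized case discussed here.

```nasm
        bits 32
        add     esp, 8          ; without optimization: 81 C4 08 00 00 00 (32-bit immediate)
        add     esp, byte 8     ; 83 C4 08 (8-bit immediate, sign-extended)
```

Assembling with a listing file (\c{nasm -f bin -l out.lst file.asm}) is an easy way to check which form was actually generated.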
  6114. \S{jmprange} My Jumps are Out of Range\I{out of range, jumps}
  6115. Similarly, people complain that when they issue \i{conditional
  6116. jumps} (which are \c{SHORT} by default) that try to jump too far,
  6117. NASM reports `short jump out of range' instead of making the jumps
  6118. longer.
  6119. This, again, is partly a predictability issue, but in fact has a
  6120. more practical reason as well. NASM has no means of being told what
  6121. type of processor the code it is generating will be run on; so it
  6122. cannot decide for itself that it should generate \i\c{Jcc NEAR} type
  6123. instructions, because it doesn't know that it's working for a 386 or
  6124. above. Alternatively, it could replace the out-of-range short
  6125. \c{JNE} instruction with a very short \c{JE} instruction that jumps
  6126. over a \c{JMP NEAR}; this is a sensible solution for processors
  6127. below a 386, but hardly efficient on processors which have good
  6128. branch prediction \e{and} could have used \c{JNE NEAR} instead. So,
  6129. once again, it's up to the user, not the assembler, to decide what
  6130. instructions should be generated. See \k{opt-O}.
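The two alternatives discussed above can be written out explicitly. This is only a sketch: the labels and the 200 bytes of padding are hypothetical and exist solely to put the target out of short-jump range.

```nasm
        bits 16
check:
        cmp     ax, bx
        jne     near far_away   ; 386 and above: 0F 85 plus a 16-bit displacement

        ; Equivalent for processors below a 386: invert the condition
        ; and hop over an unconditional near jump.
        cmp     ax, bx
        je      .skip           ; short jump, always in range
        jmp     near far_away
.skip:
        nop

        times 200 db 0x90       ; padding: far_away is now beyond short range
far_away:
        ret
```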
  6131. \S{proborg} \i\c{ORG} Doesn't Work
  6132. People writing \i{boot sector} programs in the \c{bin} format often
  6133. complain that \c{ORG} doesn't work the way they'd like: in order to
  6134. place the \c{0xAA55} signature word at the end of a 512-byte boot
  6135. sector, people who are used to MASM tend to code
  6136. \c ORG 0
  6137. \c
  6138. \c ; some boot sector code
  6139. \c
  6140. \c ORG 510
  6141. \c DW 0xAA55
  6142. This is not the intended use of the \c{ORG} directive in NASM, and
  6143. will not work. The correct way to solve this problem in NASM is to
  6144. use the \i\c{TIMES} directive, like this:
  6145. \c ORG 0
  6146. \c
  6147. \c ; some boot sector code
  6148. \c
  6149. \c TIMES 510-($-$$) DB 0
  6150. \c DW 0xAA55
  6151. The \c{TIMES} directive will insert exactly enough zero bytes into
  6152. the output to move the assembly point up to 510. This method also
  6153. has the advantage that if you accidentally fill your boot sector too
  6154. full, NASM will catch the problem at assembly time and report it, so
  6155. you won't end up with a boot sector that you have to disassemble to
  6156. find out what's wrong with it.
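As a worked example, here is a complete (if useless) boot sector using this technique. The three instructions standing in for `some boot sector code' are placeholders chosen only so the arithmetic is easy to follow.

```nasm
; build with: nasm -f bin boot.asm -o boot.bin   (output is exactly 512 bytes)
        bits 16
        org 0
start:
        cli                     ; 1 byte
        hlt                     ; 1 byte
        jmp     short start     ; 2 bytes, so $-$$ is 4 on the next line

        times 510-($-$$) db 0   ; 510 - 4 = 506 zero bytes
        dw 0xAA55               ; bytes 510 and 511: the signature word
```

If the code grew past 510 bytes, the \c{TIMES} count would become negative, and that is what NASM reports at assembly time.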
  6157. \S{probtimes} \i\c{TIMES} Doesn't Work
The other common problem with the above code arises when people write the
\c{TIMES} line as
  6160. \c TIMES 510-$ DB 0
  6161. by reasoning that \c{$} should be a pure number, just like 510, so
  6162. the difference between them is also a pure number and can happily be
  6163. fed to \c{TIMES}.
  6164. NASM is a \e{modular} assembler: the various component parts are
  6165. designed to be easily separable for re-use, so they don't exchange
  6166. information unnecessarily. In consequence, the \c{bin} output
  6167. format, even though it has been told by the \c{ORG} directive that
  6168. the \c{.text} section should start at 0, does not pass that
  6169. information back to the expression evaluator. So from the
  6170. evaluator's point of view, \c{$} isn't a pure number: it's an offset
  6171. from a section base. Therefore the difference between \c{$} and 510
  6172. is also not a pure number, but involves a section base. Values
  6173. involving section bases cannot be passed as arguments to \c{TIMES}.
  6174. The solution, as in the previous section, is to code the \c{TIMES}
  6175. line in the form
  6176. \c TIMES 510-($-$$) DB 0
  6177. in which \c{$} and \c{$$} are offsets from the same section base,
  6178. and so their difference is a pure number. This will solve the
  6179. problem and generate sensible code.
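The same rule explains the familiar length idiom. In the sketch below (labels are hypothetical), every expression that works is a difference of two offsets within one section:

```nasm
        org 0
msg:    db 'hello'
msglen  equ $ - msg             ; same section on both sides: the pure number 5
        times 16-($-$$) db 0    ; $-$$ is 5 here, so this emits 11 zero bytes
;       times 16-$ db 0         ; rejected: $ alone still involves the section base
```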
  6180. \A{ndisasm} \i{Ndisasm}
  6181. The Netwide Disassembler, NDISASM
  6182. \H{ndisintro} Introduction
  6183. The Netwide Disassembler is a small companion program to the Netwide
  6184. Assembler, NASM. It seemed a shame to have an x86 assembler,
  6185. complete with a full instruction table, and not make as much use of
  6186. it as possible, so here's a disassembler which shares the
  6187. instruction table (and some other bits of code) with NASM.
The Netwide Disassembler does nothing except to produce
disassemblies of \e{binary} source files. Unlike \c{objdump}, NDISASM
has no understanding of object file formats, and unlike \c{debug} it
will not understand \c{DOS .EXE} files. It just
disassembles.
  6193. \H{ndisrun} Running NDISASM
  6194. To disassemble a file, you will typically use a command of the form
  6195. \c ndisasm -b {16|32|64} filename
  6196. NDISASM can disassemble 16-, 32- or 64-bit code equally easily,
  6197. provided of course that you remember to specify which it is to work
  6198. with. If no \i\c{-b} switch is present, NDISASM works in 16-bit mode
  6199. by default. The \i\c{-u} switch (for USE32) also invokes 32-bit mode.
  6200. Two more command line options are \i\c{-r} which reports the version
  6201. number of NDISASM you are running, and \i\c{-h} which gives a short
  6202. summary of command line options.
  6203. \S{ndiscom} COM Files: Specifying an Origin
  6204. To disassemble a \c{DOS .COM} file correctly, a disassembler must assume
  6205. that the first instruction in the file is loaded at address \c{0x100},
  6206. rather than at zero. NDISASM, which assumes by default that any file
  6207. you give it is loaded at zero, will therefore need to be informed of
  6208. this.
  6209. The \i\c{-o} option allows you to declare a different origin for the
file you are disassembling. Its argument may be expressed in any of
the NASM numeric formats: it is decimal by default; if it begins with
`\c{$}' or `\c{0x}' or ends in `\c{H}' it's \c{hex}; if it ends in
`\c{Q}' it's \c{octal}; and if it ends in `\c{B}' it's \c{binary}.
  6214. Hence, to disassemble a \c{.COM} file:
  6215. \c ndisasm -o100h filename.com
  6216. will do the trick.
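To see why the origin matters, consider this small \c{.COM} program (a hypothetical file, used only for illustration). The address assembled into the first instruction is only meaningful if the listing is numbered from \c{0x100} as well.

```nasm
; hello.asm (hypothetical): build with  nasm -f bin hello.asm -o hello.com
        bits 16
        org 0x100
        mov     dx, msg         ; 3 bytes at 0x100; the operand is 0x108
        mov     ah, 9           ; 2 bytes at 0x103 (DOS function 9: print string)
        int     0x21            ; 2 bytes at 0x105
        ret                     ; 1 byte  at 0x107
msg:    db 'Hello$'             ; data starts at 0x108

; ndisasm -o100h hello.com   : listing offsets start at 0x100, so the
;                              0x108 operand matches the data's offset
; ndisasm hello.com          : listing offsets start at 0, and 0x108
;                              no longer corresponds to anything listed
```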
  6217. \S{ndissync} Code Following Data: Synchronisation
  6218. Suppose you are disassembling a file which contains some data which
  6219. isn't machine code, and \e{then} contains some machine code. NDISASM
  6220. will faithfully plough through the data section, producing machine
  6221. instructions wherever it can (although most of them will look
  6222. bizarre, and some may have unusual prefixes, e.g. `\c{FS OR AX,0x240A}'),
and generating `DB' instructions every so often if it's totally stumped.
  6224. Then it will reach the code section.
  6225. Supposing NDISASM has just finished generating a strange machine
  6226. instruction from part of the data section, and its file position is
  6227. now one byte \e{before} the beginning of the code section. It's
  6228. entirely possible that another spurious instruction will get
  6229. generated, starting with the final byte of the data section, and
  6230. then the correct first instruction in the code section will not be
  6231. seen because the starting point skipped over it. This isn't really
  6232. ideal.
  6233. To avoid this, you can specify a `\i\c{synchronisation}' point, or indeed
  6234. as many synchronisation points as you like (although NDISASM can
  6235. only handle 2147483647 sync points internally). The definition of a sync
  6236. point is this: NDISASM guarantees to hit sync points exactly during
  6237. disassembly. If it is thinking about generating an instruction which
  6238. would cause it to jump over a sync point, it will discard that
  6239. instruction and output a `\c{db}' instead. So it \e{will} start
  6240. disassembly exactly from the sync point, and so you \e{will} see all
  6241. the instructions in your code section.
  6242. Sync points are specified using the \i\c{-s} option: they are measured
  6243. in terms of the program origin, not the file position. So if you
want to synchronise after 32 bytes of a \c{.COM} file, you would have to
  6245. do
  6246. \c ndisasm -o100h -s120h file.com
  6247. rather than
  6248. \c ndisasm -o100h -s20h file.com
  6249. As stated above, you can specify multiple sync markers if you need
  6250. to, just by repeating the \c{-s} option.
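A small test file makes the origin-relative numbering concrete. The source below is a hypothetical example, not part of NASM; its code resumes 8 bytes into the file, which is address \c{0x108} once the \c{.COM} origin is taken into account.

```nasm
; sync.asm (hypothetical): build with  nasm -f bin sync.asm -o sync.com
        bits 16
        org 0x100
        jmp     short start     ; 2 bytes at 0x100
data:   db 'Hello$'             ; 6 bytes of data at 0x102..0x107
start:                          ; 0x108: where real code resumes
        mov     dx, data
        mov     ah, 9
        int     0x21
        ret

; ndisasm -o100h -s108h sync.com   : sync point given in origin terms
; (using -s8h here would name an address below the origin instead)
```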
  6251. \S{ndisisync} Mixed Code and Data: Automatic (Intelligent) Synchronisation
  6252. \I\c{auto-sync}
  6253. Suppose you are disassembling the boot sector of a \c{DOS} floppy (maybe
  6254. it has a virus, and you need to understand the virus so that you
  6255. know what kinds of damage it might have done you). Typically, this
  6256. will contain a \c{JMP} instruction, then some data, then the rest of the
  6257. code. So there is a very good chance of NDISASM being \e{misaligned}
  6258. when the data ends and the code begins. Hence a sync point is
  6259. needed.
  6260. On the other hand, why should you have to specify the sync point
  6261. manually? What you'd do in order to find where the sync point would
  6262. be, surely, would be to read the \c{JMP} instruction, and then to use
  6263. its target address as a sync point. So can NDISASM do that for you?
  6264. The answer, of course, is yes: using either of the synonymous
  6265. switches \i\c{-a} (for automatic sync) or \i\c{-i} (for intelligent
  6266. sync) will enable \c{auto-sync} mode. Auto-sync mode automatically
  6267. generates a sync point for any forward-referring PC-relative jump or
  6268. call instruction that NDISASM encounters. (Since NDISASM is one-pass,
  6269. if it encounters a PC-relative jump whose target has already been
  6270. processed, there isn't much it can do about it...)
  6271. Only PC-relative jumps are processed, since an absolute jump is
  6272. either through a register (in which case NDISASM doesn't know what
  6273. the register contains) or involves a segment address (in which case
  6274. the target code isn't in the same segment that NDISASM is working
  6275. in, and so the sync point can't be placed anywhere useful).
  6276. For some kinds of file, this mechanism will automatically put sync
  6277. points in all the right places, and save you from having to place
  6278. any sync points manually. However, it should be stressed that
  6279. auto-sync mode is \e{not} guaranteed to catch all the sync points, and
  6280. you may still have to place some manually.
  6281. Auto-sync mode doesn't prevent you from declaring manual sync
  6282. points: it just adds automatically generated ones to the ones you
  6283. provide. It's perfectly feasible to specify \c{-i} \e{and} some \c{-s}
  6284. options.
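The following sketch shows a case where both kinds of sync point are useful. The file layout is invented for illustration: a forward \c{JMP} that auto-sync can follow, and a routine reached only through a register, which it cannot.

```nasm
; boot.asm (hypothetical): build with  nasm -f bin boot.asm -o boot.bin
        bits 16
        jmp     short code      ; offset 0: forward PC-relative jump
        nop                     ; offset 2
        db      'MYDISK  '      ; offsets 3..10: data
        dw      512             ; offsets 11..12: data
code:                           ; offset 13: found by auto-sync via the JMP
        mov     bx, handler
        call    bx              ; indirect call: auto-sync cannot follow it
        hlt
        db      0xB8            ; offset 19: data byte that looks like MOV AX,imm16
handler:                        ; offset 20: needs a manual sync point
        xor     ax, ax
        ret

; ndisasm -a boot.bin        : syncs at 13, but the byte at 19 is decoded
;                              as MOV AX,imm16 and swallows the XOR at 20
; ndisasm -a -s20 boot.bin   : adds a manual sync point at offset 20
```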
  6285. Another caveat with auto-sync mode is that if, by some unpleasant
  6286. fluke, something in your data section should disassemble to a
  6287. PC-relative call or jump instruction, NDISASM may obediently place a
  6288. sync point in a totally random place, for example in the middle of
  6289. one of the instructions in your code section. So you may end up with
a wrong disassembly even if you use auto-sync. Again, there isn't
much NDISASM can do about this. If you have problems, you'll have to use
  6292. manual sync points, or use the \c{-k} option (documented below) to
  6293. suppress disassembly of the data area.
  6294. \S{ndisother} Other Options
  6295. The \i\c{-e} option skips a header on the file, by ignoring the first N
  6296. bytes. This means that the header is \e{not} counted towards the
  6297. disassembly offset: if you give \c{-e10 -o10}, disassembly will start
  6298. at byte 10 in the file, and this will be given offset 10, not 20.
  6299. The \i\c{-k} option is provided with two comma-separated numeric
  6300. arguments, the first of which is an assembly offset and the second
  6301. is a number of bytes to skip. This \e{will} count the skipped bytes
  6302. towards the assembly offset: its use is to suppress disassembly of a
  6303. data section which wouldn't contain anything you wanted to see
  6304. anyway.
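The difference between the two options is easiest to see on a small file. The layout below is a hypothetical example: a 10-byte header, followed by code that jumps over 4 bytes of data.

```nasm
; hdr.asm (hypothetical): build with  nasm -f bin hdr.asm -o hdr.bin
        bits 16
        db      'HDR0', 0, 0, 0, 0, 0, 0   ; 10-byte header: file bytes 0..9
code:                                      ; file byte 10
        jmp     short after                ; 2 bytes
        db      'DATA'                     ; 4 data bytes
after:
        ret

; ndisasm -e10 hdr.bin          : header ignored; the JMP is listed at offset 0
; ndisasm -e10 -o10 hdr.bin     : same bytes, but the JMP is listed at offset 10
; ndisasm -e10 -k 2,4 hdr.bin   : also skips the 4 data bytes at offset 2;
;                                 the RET is still listed at offset 6
```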
  6305. \A{inslist} \i{Instruction List}
  6306. \H{inslistintro} Introduction
  6307. The following sections show the instructions which NASM currently supports. For each
  6308. instruction, there is a separate entry for each supported addressing mode. The third
  6309. column shows the processor type in which the instruction was introduced and,
  6310. when appropriate, one or more usage flags.
  6311. \& inslist.src
  6312. \A{changelog} \i{NASM Version History}
  6313. \& changes.src
  6314. \A{source} Building NASM from Source
  6315. The source code for NASM is available from our website,
\W{http://www.nasm.us/}\c{http://www.nasm.us/}; see \k{website}.
  6317. \H{tarball} Building from a Source Archive
  6318. The source archives available on the web site should be capable of
  6319. building on a number of platforms. This is the recommended method for
  6320. building NASM to support platforms for which executables are not
  6321. available.
On a system which has a Unix shell (\c{sh}), run:
  6323. \c sh configure
  6324. \c make everything
  6325. A number of options can be passed to \c{configure}; see
  6326. \c{sh configure --help}.
A set of Makefiles for some other environments is also available;
  6328. please see the file \c{Mkfiles/README}.
  6329. To build the installer for the Windows platform, you will need the
\i\e{Nullsoft Scriptable Install System}, \i{NSIS}, installed.
  6331. To build the documentation, you will need a set of additional tools.
  6332. The documentation is not likely to be able to build on non-Unix
  6333. systems.
  6334. \H{git} Building from the \i\c{git} Repository
  6335. The NASM development tree is kept in a source code repository using
  6336. the \c{git} distributed source control system. The link is available
  6337. on the website. This is recommended only to participate in the
  6338. development of NASM or to assist with testing the development code.
To build NASM from the \c{git} repository you will need Perl and, if
  6340. building on a Unix system, GNU autoconf.
  6341. To build on a Unix system, run:
  6342. \c sh autogen.sh
  6343. to create the \c{configure} script and then build as listed above.
  6344. \A{contact} Contact Information
  6345. \H{website} Website
  6346. NASM has a \i{website} at
  6347. \W{http://www.nasm.us/}\c{http://www.nasm.us/}.
  6348. \i{New releases}, \i{release candidates}, and \I{snapshots, daily
  6349. development}\i{daily development snapshots} of NASM are available from
  6350. the official web site in source form as well as binaries for a number
  6351. of common platforms.
  6352. \S{forums} User Forums
  6353. Users of NASM may find the Forums on the website useful. These are,
  6354. however, not frequented much by the developers of NASM, so they are
  6355. not suitable for reporting bugs.
  6356. \S{develcom} Development Community
The development of NASM is coordinated primarily through the
  6358. \i\c{nasm-devel} mailing list. If you wish to participate in
  6359. development of NASM, please join this mailing list. Subscription
  6360. links and archives of past posts are available on the website.
  6361. \H{bugs} \i{Reporting Bugs}\I{bugs}
  6362. To report bugs in NASM, please use the \i{bug tracker} at
  6363. \W{http://www.nasm.us/}\c{http://www.nasm.us/} (click on "Bug
  6364. Tracker"), or if that fails then through one of the contacts in
  6365. \k{website}.
  6366. Please read \k{qstart} first, and don't report the bug if it's
  6367. listed in there as a deliberate feature. (If you think the feature
  6368. is badly thought out, feel free to send us reasons why you think it
  6369. should be changed, but don't just send us mail saying `This is a
  6370. bug' if the documentation says we did it on purpose.) Then read
  6371. \k{problems}, and don't bother reporting the bug if it's listed
  6372. there.
  6373. If you do report a bug, \e{please} make sure your bug report includes
  6374. the following information:
  6375. \b What operating system you're running NASM under. Linux,
  6376. FreeBSD, NetBSD, MacOS X, Win16, Win32, Win64, MS-DOS, OS/2, VMS,
  6377. whatever.
\b Whether you compiled your own executable from a source archive, compiled
your own executable from \c{git}, used the standard distribution
binaries from the website, or got an executable from somewhere else
(e.g. a Linux distribution). If you were using a locally built
  6382. executable, try to reproduce the problem using one of the standard
  6383. binaries, as this will make it easier for us to reproduce your problem
  6384. prior to fixing it.
  6385. \b Which version of NASM you're using, and exactly how you invoked
  6386. it. Give us the precise command line, and the contents of the
  6387. \c{NASMENV} environment variable if any.
  6388. \b Which versions of any supplementary programs you're using, and
  6389. how you invoked them. If the problem only becomes visible at link
  6390. time, tell us what linker you're using, what version of it you've
  6391. got, and the exact linker command line. If the problem involves
  6392. linking against object files generated by a compiler, tell us what
  6393. compiler, what version, and what command line or options you used.
  6394. (If you're compiling in an IDE, please try to reproduce the problem
  6395. with the command-line version of the compiler.)
  6396. \b If at all possible, send us a NASM source file which exhibits the
  6397. problem. If this causes copyright problems (e.g. you can only
  6398. reproduce the bug in restricted-distribution code) then bear in mind
  6399. the following two points: firstly, we guarantee that any source code
  6400. sent to us for the purposes of debugging NASM will be used \e{only}
  6401. for the purposes of debugging NASM, and that we will delete all our
  6402. copies of it as soon as we have found and fixed the bug or bugs in
  6403. question; and secondly, we would prefer \e{not} to be mailed large
  6404. chunks of code anyway. The smaller the file, the better. A
  6405. three-line sample file that does nothing useful \e{except}
  6406. demonstrate the problem is much easier to work with than a
  6407. fully fledged ten-thousand-line program. (Of course, some errors
  6408. \e{do} only crop up in large files, so this may not be possible.)
  6409. \b A description of what the problem actually \e{is}. `It doesn't
  6410. work' is \e{not} a helpful description! Please describe exactly what
  6411. is happening that shouldn't be, or what isn't happening that should.
  6412. Examples might be: `NASM generates an error message saying Line 3
  6413. for an error that's actually on Line 5'; `NASM generates an error
  6414. message that I believe it shouldn't be generating at all'; `NASM
  6415. fails to generate an error message that I believe it \e{should} be
  6416. generating'; `the object file produced from this source code crashes
  6417. my linker'; `the ninth byte of the output file is 66 and I think it
  6418. should be 77 instead'.
  6419. \b If you believe the output file from NASM to be faulty, send it to
  6420. us. That allows us to determine whether our own copy of NASM
  6421. generates the same file, or whether the problem is related to
  6422. portability issues between our development platforms and yours. We
  6423. can handle binary files mailed to us as MIME attachments, uuencoded,
  6424. and even BinHex. Alternatively, we may be able to provide an FTP
  6425. site you can upload the suspect files to; but mailing them is easier
  6426. for us.
  6427. \b Any other information or data files that might be helpful. If,
  6428. for example, the problem involves NASM failing to generate an object
  6429. file while TASM can generate an equivalent file without trouble,
  6430. then send us \e{both} object files, so we can see what TASM is doing
  6431. differently from us.