B2B SHIPPING GUIDE

Evidence-Grade Delivery

The new architecture of B2B logistics


When your shipper argues from memory and the carrier argues from data, you lose. That's the pattern behind disputed claims, missed dock appointments, emissions figures no auditor will countersign, and vendor platforms that demo nicely but fall over in month three.

Evidence-Grade Delivery is the standard that puts you on the data side of the table. Read the guide and you'll have the vocabulary to name the gap, the framework to brief a vendor in one session, and the pilot path to prove a platform before it adds to your contract stack.


Section 1

Why B2B shipping breaks differently

A common pattern on the dock

The driver pulls into a contracted receiver inside the booked window. The paperwork is correct. The dock is not ready. Earlier in the morning, the receiving line moved forward by forty minutes to clear a different supplier, and the appointment the shipper booked two weeks ago no longer matches the reality of the line. The receiving supervisor waves the driver off and points to tomorrow's slots. The truck leaves with the freight still on it.

Finance

Invoices on the original cycle.

Service

Hears about the failed drop from the receiver the next day.

Operations

Learns about it from finance the week after.

That sequence is the shape of B2B shipping failure. The surface reading is that the carrier let the shipper down. The deeper reading is that nobody in finance, service, or operations ended up with a shipment record they could act on before the receiver called.

Consumer delivery playbooks do not describe this operation. The consumer receives a parcel, the relationship sits inside a retail brand's customer-service loop, and a failed delivery is a ticket. B2B delivery sits inside a commercial contract. Delivery performance is a line item in that contract. Appointments are treated as schedules, not windows. Every shipment is expected to produce evidence that will hold up against the other party's records. When any of those three gives way, the cost lands inside the business, not in a support queue.

At one end of the pressure, a half-hour window is the normal operating unit. Unilever Sweden runs into retailers like ICA and COOP with that tolerance, on around eight hundred pallets a day, out of a cold warehouse holding more than ten thousand. That is not a special case. It is the shape of B2B freight into large receivers. Miss the window, and the pallet rides back to the warehouse.

Fault lines

Three fault lines B2C playbooks ignore

Contracts

Most B2B shippers are tied into commercial agreements with service levels, appointment regimes, and penalty structures negotiated at scale. A missed dock window is a breach of a specific commitment, and the receiver will reference it in the next commercial review. Carrier choice is sometimes dictated by those contracts too. A receiver or marketplace may require that a particular carrier handle its inbound, leaving the shipper to solve for that carrier inside a platform whether it fits the rest of the operation or not.

Appointments

B2B freight runs against booked slots. Receiving lines, dock doors, and cross-docks have finite capacity allocated across dozens of inbound shipments a day. A truck inside its window is taken. A truck outside it is refused, and the rebook is usually the next day. Fifteen minutes late can cost the drop entirely. The bill arrives in pieces: a return journey, a storage charge, a rebooked slot, and a receiver who starts tightening their own appointment regime to protect themselves.

Evidence

Every B2B shipment should produce a commercial record: proof of shipment and proof of delivery, exception data when something deviates, a claims pack when goods arrive damaged or short, and emissions data that customers increasingly ask for in their own reports. These records are commercial instruments. They settle invoices, win or lose disputes, and feed the reporting cycles sitting above the supply chain. Where the evidence is weak, the shipper argues from memory while the carrier argues from data.

Baseline conditions

What a head of supply chain should be able to assume

Four conditions matter most. A supply-chain leader under pressure to hold service, control cost, and report emissions accurately should be able to take them as given.

Carrier performance is measurable against the contract.

Numbers read per lane and per service, not only per carrier, and not only at the annual review.

Exceptions surface early enough to act.

The window for action closes when the receiver opens the dispute. Anything that arrives after that is cleanup.

Any shipment timeline is retrievable.

Whoever is pulled into the conversation can find it without assembling it from three systems, and the people retrieving it agree on what it says.

A claim is retrievable as one pack.

The manifest, the scan events, the POD, the exception note, and the photograph all live in the same place when the claim window opens.

Most shippers cannot take any of this for granted. The operation sits on top of WMS, TMS, ERP, carrier portals, and service inboxes that were wired to run the physical movement, not to produce commercial evidence at the shipment level.

Where the damage lands

Weak evidence shows up as a pattern over time

Invoice disputes go the wrong way, because the carrier's data is cleaner than the shipper's. Finance, service, and operations arrive at the same internal meeting with three different versions of the same shipment, and spend the meeting reconciling instead of deciding. Contracted receivers start engineering their own process around the shipper, adding inspection steps or tightening windows. That engineering is a quiet escalation, and it eventually lands in the commercial conversation. Emissions reports get built from estimates rather than shipment reality, and as receivers ask for the methodology, the estimates fall apart.

Some shippers hold one or two of these conditions on their core lanes. The leak opens where the evidence has to cross a team or a system. These consequences rarely arrive as a single crisis. They accumulate, and they usually resolve as tighter receiver terms, less tolerance on appointments, or a smaller share of an account's inbound volume.

What control looks like

The record is written as the shipment moves

The operation that holds up has the same shape every time. The record is written as the shipment moves, and the evidence is already there when a question is asked of it.

A single execution surface.

Every shipment is prepared to a common standard across carriers and modes, so the shape of the record does not depend on who booked it.

A normalized milestone layer.

Events returning from carriers are reshaped into one vocabulary, so a shipment's timeline reads the same way whether it moved by parcel, pallet, or container.

An evidence posture.

Claims packs, customs documents, and dispute material are retrievable as units when the question comes, not assembled from inboxes and folders on the day of the deadline.

Emissions reporting on the shipment record.

Numbers read directly off the movement against recognized methodology, so the report starts from reality, not from a spreadsheet of averages.

Proof · Unilever Sweden

Unilever Sweden's Helsingborg operation is a direct expression of what it takes to run inside tight B2B windows at real volume. A cold warehouse with a capacity of over 10,000 pallets, dispatching around 800 pallets daily into retailers like ICA and COOP. The shipments are timed down to the half-hour, and the system underneath has to hold.

We often have half an hour to deliver our goods. If we miss that time slot, we simply have to turn around and go back to the warehouse again.

It is very important to us that our transports go exactly according to schedule. That places a high demand on our systems to always be stable, which [nShift's] always have been.

Customer context

Unilever Sweden

Cold-chain pallet distribution into ICA and COOP. Running on nShift.

Tord Tillman, Process Specialist, Unilever Sweden

The next section breaks that pressure into five shipment archetypes, because each one fails in a different place.

Section 2

The five shipment archetypes

Most B2B shippers run more than one kind of shipment out of the same operation. A supply-chain leader oversees multi-parcel deliveries into trade customers, pallet freight into retail, multimodal movements that hand off between road and sea, cross-border exports that depend on pre-arrival customs data, and dangerous-goods or temperature-sensitive shipments with their own paperwork regimes. Each one breaks differently, and the evidence each one needs to hold up commercially is different too.

Treating those flows as one undifferentiated mass is the first place operations lose control.

A platform that handles parcel well but cannot carry a pallet appointment regime will cost the operation every time a dock window fails. The matrix below summarizes how each archetype loads, the carrier mode it rides on, the evidence it produces, and the seam where it tends to fail. The five sections that follow walk through one archetype at a time.

Overview

Archetype matrix

Archetype | Load shape | Carrier mode | Evidence requirement | Where it breaks
--- | --- | --- | --- | ---
Multi-parcel B2B | Parcels grouped under one commercial order | Parcel carriers, often several in parallel | Order-level manifest linking parcels to line items | Parcels split, one short, service reconstructs from carrier portals
Pallet freight | Single pallets or part-loads into appointment-based receivers | General haulage, dedicated regional carriers | Booked appointment tied to shipment, POD with window outcome | Missed window, refused intake, rebooking lands on the shipper
Multimodal | One consignment moving across two or more modes | Road plus sea plus road, or similar | Shipment record that holds across legs, document set that follows the freight | Visibility dies at the handover; planners start calling the forwarder
Cross-border | Any load crossing an external border | Any mode; mode matters less than the data | Pre-arrival customs data, structured freight information, commercial documents | Data not in the state it should be when the freight is ready, movement held
Dangerous goods and temperature-sensitive | Regulated hazardous loads or controlled-temperature loads | Carriers with DG or cold-chain certification | DG paperwork, temperature evidence, audit-grade record tied to shipment | Rejection at destination, regulator finding, evidence gap at audit

Archetype 1

Multi-parcel B2B

A familiar scene in service

A B2B customer takes delivery of an order shipped as three parcels. Two arrive together in the morning. The third arrives later the same day. When the receiving clerk opens the third parcel, one unit is short against the purchase order. A photograph reaches the shipper's service team within the hour. The service team can see three parcels marked delivered. They cannot see quickly that the three parcels belong to one order, and they cannot see what was in each parcel against the order lines. The conversation is spent catching up. The receiver already knew.

Multi-parcel B2B fails at the seam between the parcel and the order. Parcel-level tracking is a commodity; every carrier provides it. Order-level truth is harder, because it lives in the shipper's systems and is rarely reconciled to the shipment record until someone has a reason to reconcile it. That reason is usually a dispute, and by the time it arrives the clock is already running against the shipper.

The record the operation needs is not complicated. The order exists as one object, with parcels tying back to it and scan events from each carrier reconciling against it in one view. When parcels separate in transit, an exception is raised against the order, and the outcome is recorded. When a receiver reports a shortage, the shipper can see which parcel held which unit and where the evidence gap opened.
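As a rough illustration of that record, the sketch below models one order with its parcels and reconciles delivered contents against the order lines. The class names, field names, and SKUs are hypothetical and chosen only for this example; a real operation would populate them from its own order and packing data.

```python
from dataclasses import dataclass, field


@dataclass
class Parcel:
    parcel_id: str                 # carrier tracking number (hypothetical format)
    contents: dict[str, int]       # SKU -> units packed in this parcel
    delivered: bool = False


@dataclass
class Order:
    order_id: str
    lines: dict[str, int]          # SKU -> units ordered
    parcels: list[Parcel] = field(default_factory=list)


def shortages(order: Order) -> dict[str, int]:
    """Return SKU -> missing units, counting only delivered parcels."""
    received: dict[str, int] = {}
    for parcel in order.parcels:
        if parcel.delivered:
            for sku, qty in parcel.contents.items():
                received[sku] = received.get(sku, 0) + qty
    return {
        sku: ordered - received.get(sku, 0)
        for sku, ordered in order.lines.items()
        if received.get(sku, 0) < ordered
    }


def parcels_holding(order: Order, sku: str) -> list[str]:
    """List the parcels that were packed with a given SKU."""
    return [p.parcel_id for p in order.parcels if sku in p.contents]


order = Order(
    order_id="PO-1001",
    lines={"VALVE-A": 10, "HOSE-B": 4},
    parcels=[
        Parcel("P1", {"VALVE-A": 5}, delivered=True),
        Parcel("P2", {"VALVE-A": 5}, delivered=True),
        Parcel("P3", {"HOSE-B": 3}, delivered=True),  # packed one unit short
    ],
)
print(shortages(order))                  # {'HOSE-B': 1}
print(parcels_holding(order, "HOSE-B"))  # ['P3']
```

With the order held as one object, the shortage and the parcel that should have carried the missing unit come out of a single lookup rather than a reconstruction from carrier portals.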

When that evidence is weak, service rebuilds the order from carrier portals and warehouse notes while the receiver waits. Some shortage claims get conceded because defending them costs more than paying them. Others are contested on carrier evidence the shipper cannot match, and lost. Receivers eventually adjust their inspection regime, which slows inbound further.

Proof · JYSK

JYSK runs multi-country parcel B2B across Sweden, Denmark, Poland, and Bulgaria on nShift Ship, with the platform integrated into CRM and SAP.

We get track and trace data from carriers via EDI from [nShift]. This is integrated to our CRM system, where customer service can troubleshoot and track parcels fast and easy with one-click access.

Customer context

JYSK

Home furnishings retail, 3,000+ stores across 50+ countries. Running on nShift Ship.

Ole Rønnest Nielsen, Head of Logistics IT, JYSK

Archetype 2

Pallet freight

Thirty-minute drift

The transport planner booked the window six weeks ago on a lane the shipper runs every Tuesday into a major retailer. The receiver's line moves. It ran thirty-five minutes ahead on that particular Tuesday because a separate supplier cleared faster than planned. The driver arrives inside the booked window and finds the line past his slot. The dock supervisor waves him off and points to next Wednesday. The pallet rides back to the warehouse. The origin-side planner learns of the refusal from the carrier's rebooking system, not from the dock, and the receiver records the failure in its supplier scorecard.

Pallet freight is the archetype where time becomes money down to the half-hour. Receivers allocate dock doors in tight windows across dozens of inbound shipments a day. The shipper's booking is one slot in that allocation, the carrier rate is priced against it, and the truck's next drop is planned against it. When the window fails, the downstream plan fails with it.

The failure comes from a mismatch between the window the shipper booked and the window the receiver is actually running. Receiving lines move. Book-to-drop drift of thirty or forty minutes is routine; an hour is not unusual on a busy day. The driver who shows up inside the original window finds the line isn't ready. The freight turns around, finance invoices on the original cycle, and the receiver references the failure in the next commercial review.

The operation needs a booked appointment held on the shipment record, a confirmed carrier ETA that can be compared to that appointment as the truck moves, and a proof of delivery that captures the outcome of the window. Dock-level exception detail has to read the same way across carriers and lanes, so a pattern of refusals is visible in the aggregate rather than only in the anecdote.
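The comparison between a booked appointment and a carrier ETA can be sketched in a few lines. This is a minimal example under stated assumptions: the function name, the `receiver_drift` input (how far the receiving line is running ahead of or behind plan), the 15-minute warning margin, and the status labels are all illustrative, not part of any specific platform.

```python
from datetime import datetime, timedelta


def window_risk(
    window_start: datetime,
    window_end: datetime,
    carrier_eta: datetime,
    receiver_drift: timedelta = timedelta(0),
    warn_margin: timedelta = timedelta(minutes=15),
) -> str:
    """Classify an ETA against the window the receiver is actually running.

    receiver_drift is negative when the receiving line runs ahead of plan
    and positive when it runs behind. Thresholds are illustrative only.
    """
    effective_start = window_start + receiver_drift
    effective_end = window_end + receiver_drift
    if carrier_eta > effective_end:
        return "late"
    if carrier_eta < effective_start:
        return "early"
    if effective_end - carrier_eta <= warn_margin:
        return "at_risk"
    return "on_time"


start = datetime(2025, 3, 4, 10, 0)
end = datetime(2025, 3, 4, 10, 30)
eta = datetime(2025, 3, 4, 10, 10)

# Against the booked window the truck looks fine.
print(window_risk(start, end, eta))                             # on_time
# With the line running 35 minutes ahead, the same ETA misses the slot.
print(window_risk(start, end, eta, timedelta(minutes=-35)))     # late
```

The second call mirrors the scene above: the driver is inside the booked window, but the window the receiver is running has already passed.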

Proof · Unilever Sweden

Unilever Sweden dispatches around 800 pallets a day from their Helsingborg cold warehouse into retailers including ICA and COOP, on nShift TMS.

The information from [nShift] makes it possible for us to constantly measure and improve our deliveries.

Customer context

Unilever Sweden

Cold-chain pallet distribution into Swedish retailers. Running on nShift TMS.

Tord Tillman, Process Specialist, Unilever Sweden

Archetype 3

Multimodal

A shipment in three acts

A consignment moves road, then sea, then road again. Each leg has its own carrier, its own document set, and its own time signature. Nothing in the shipper's operation naturally reconciles the three legs, because the systems were bought to serve different jobs. Visibility dies at the handover, where one leg's events stop arriving and the next leg's have not yet started. Planners stop trusting the data surface and start calling the forwarder. The call itself is expensive, and the pattern of calling is a symptom.

The operation needs a shipment record that survives the leg change, documents that follow the freight rather than the system boundary, and ETA that updates honestly across modes. Multimodal ETA is often provisional: a port has not assigned a berth, a driver has not been booked for the inland leg, or a vessel has been reallocated. The record has to carry that provisional state plainly, rather than smoothing it into a confident timestamp the operation will later discover was wrong. A single view of where the shipment is and who has custody of it, readable by operations without translation, is the minimum surface this archetype requires.

Most multimodal shipments go wrong at the handover. Exceptions surface late. Customer-facing commitments get made on partial information, because the planner has to commit before the forwarder has called back. The cost shows up in customer service first and in renegotiated rates second.

What this archetype requires is a shipment spine that holds across modes: carrier and forwarder data flowing in as events, documents generated from the shipment record instead of assembled from inboxes, and ETAs that carry confidence levels alongside timestamps.
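One way to picture an ETA that carries a confidence level alongside its timestamp is sketched below. The `Leg` structure, the three confidence labels, and the rule for deriving them are assumptions made for this example; real carrier and forwarder feeds would need their own mapping.

```python
from dataclasses import dataclass, field
from datetime import datetime
from enum import Enum
from typing import Optional


class Confidence(Enum):
    CONFIRMED = "confirmed"      # every leg has an ETA and no open dependency
    PROVISIONAL = "provisional"  # an ETA exists but something is still open
    UNKNOWN = "unknown"          # no usable estimate for the final leg


@dataclass
class Leg:
    mode: str                                  # e.g. "road", "sea"
    eta: Optional[datetime]                    # None when no estimate exists
    open_issues: list[str] = field(default_factory=list)


def shipment_eta(legs: list[Leg]):
    """Return (eta, confidence, issues) for the final leg of a shipment."""
    issues = [f"{leg.mode}: {issue}" for leg in legs for issue in leg.open_issues]
    final_eta = legs[-1].eta
    if final_eta is None:
        return None, Confidence.UNKNOWN, issues
    if issues or any(leg.eta is None for leg in legs):
        return final_eta, Confidence.PROVISIONAL, issues
    return final_eta, Confidence.CONFIRMED, issues


legs = [
    Leg("road", datetime(2025, 5, 1, 8, 0)),
    Leg("sea", datetime(2025, 5, 6, 14, 0), ["berth not assigned"]),
    Leg("road", datetime(2025, 5, 7, 11, 0), ["driver not booked"]),
]
eta, confidence, issues = shipment_eta(legs)
print(eta, confidence.value, issues)
```

The point of the design is that the provisional state is returned with the timestamp, so a planner sees why the date may move instead of reading it as a firm commitment.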

Proof · Mobile Climate Control

Mobile Climate Control manufactures HVAC systems for commercial and specialty vehicles, with production sites across Europe, North America, Asia, and Africa, running complex multimodal transport flows on nShift TMS.

It is already difficult to imagine running the business and managing the company in terms of shipment planning and control without the support of [nShift].

Customer context

Mobile Climate Control

HVAC manufacturing across four continents, complex multimodal flows. Running on nShift TMS.

Mariusz Król, Managing Director and President for Europe, Mobile Climate Control

Archetype 4

Cross-border

A familiar scene in cross-border operations

The freight is ready at the origin warehouse. Paperwork looks complete at first glance. The pre-arrival data set the carrier needs to submit on the shipper's behalf is not in the state it should be. Under the current customs regime, the movement requires specific structured data before the freight physically moves. A missing element sits in a different system; nobody on the origin side is sure which. The broker needs it inside the hour. The freight is held. It is a short delay at origin that would be invisible on most dashboards, but the receiver notices, and the next booking is conditional.

In cross-border, the bottleneck is the data. When the shipment is also multimodal, the data problem compounds, because documents need to follow the freight across the leg change as well as across the border. Under ICS2 Release 3, operational across all transport modes in the EU since September 2025 with limited derogations in some member states, advance cargo information has to reach the customs authority before the goods arrive, in a structured form. The eFTI Regulation moves in parallel: freight information is shifting from paper into structured electronic data across road, rail, inland waterways, and air.

Shippers who generate the required data from the shipment record, at the moment of shipment preparation, clear these requirements without the scramble. Shippers who still assemble the data from multiple systems run into dwell time at origin, and increasingly at borders where the data cannot be produced on demand.

The operation needs commercial documents generated from shipment data rather than rekeyed, customs-ready data on the shipment before it leaves the origin dock, a record of what was sent and when in a form the customs authority accepts, and export documents and specialist labels produced automatically for the destinations that require them.
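A pre-dispatch completeness check on the shipment record might look like the sketch below. The field list is a hypothetical placeholder, not the regulatory ICS2 or eFTI data set; the actual required elements depend on the customs regime, the mode, and the carrier or broker filing the data.

```python
# Illustrative field names only; NOT the regulatory data set.
REQUIRED_FIELDS = (
    "consignor",
    "consignee",
    "goods_description",
    "commodity_code",
    "gross_weight_kg",
)


def missing_customs_fields(shipment: dict) -> list[str]:
    """Return the required fields that are absent or blank on the record."""
    missing = []
    for name in REQUIRED_FIELDS:
        value = shipment.get(name)
        if value is None or (isinstance(value, str) and not value.strip()):
            missing.append(name)
    return missing


def ready_for_dispatch(shipment: dict) -> bool:
    """True only when every required field is present on the shipment."""
    return not missing_customs_fields(shipment)


shipment = {
    "consignor": "Example Shipper AB",
    "consignee": "Example Receiver GmbH",
    "goods_description": "Pump housings, cast aluminium",
    "commodity_code": "",          # blank: still sitting in another system
    "gross_weight_kg": 412.5,
}
print(missing_customs_fields(shipment))  # ['commodity_code']
print(ready_for_dispatch(shipment))      # False
```

Running the check at shipment preparation names the missing element while the freight is still in the warehouse, which is the moment the scene above shows it being discovered too late.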

Border dwell is expensive and visible. Customers do not quietly rebook around it; they raise it. Repeated dwell patterns become a procurement conversation, and then a contract conversation. Preferred-supplier status on lanes with tight service expectations erodes quietly.

Proof · Van Gogh Museum

The Van Gogh Museum ships merchandise globally from its webshop and restocks four retail stores several times a week, on nShift Transsmart integrated with Exact Globe ERP. Each destination has its own carrier coverage, export documents, and specialist labels, all selected and generated automatically by the ERP.

For each purchase, the correct carrier is automatically selected by the ERP system based on the shipment destination. [nShift] ensures the automatic generation of the required documents and special labels. Previously, this was all done by hand.

Customer context

Van Gogh Museum

Retail, webshop, and fragile-goods cross-border shipping. Running on nShift Transsmart, integrated with Exact Globe ERP.

Peter Vogler, Head of Logistics & Planning, Van Gogh Museum

Archetype 5

Dangerous goods and temperature-sensitive shipments

A scene at the loading bay

A third-shift dispatcher closes out an order of a flammable industrial solvent. The UN-coded paperwork is half-generated, and the label run on the other printer is waiting on a carrier qualification the system did not auto-check. The freight is already on the tug. A second driver, carrying a chilled pharmaceutical consignment, asks for the temperature log that needs to live against the shipment before the trailer leaves. The log sits on a handheld device the warehouse passes between shifts. The carrier wants it in one file, tied to one manifest, in the next ten minutes.

Two shipment types share this archetype because they share one trait: the paperwork and the handling regime are as load-bearing as the freight itself. Dangerous goods require UN-coded documentation, handler certification, and carrier qualification. Temperature-sensitive cargo requires continuous environmental evidence tied to the shipment. Errors in either delay the load and create regulatory exposure that shows up at audit.

The operation needs DG documents generated correctly and automatically from the shipment record, temperature evidence captured continuously and tied to the shipment, and a record that holds up to an audit as well as to an invoice. Where a receiver rejects a load on environmental or documentation grounds, the evidence to dispute or concede the rejection is already assembled.
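For the temperature side, two simple checks on a log tied to a shipment are whether any reading left the permitted range and whether the log has gaps that undermine the claim of continuous capture. The sketch below assumes readings arrive as (timestamp, degrees Celsius) pairs; the 2 to 8 °C range and the sample values are illustrative, and the real limits come from the product and the contract.

```python
from datetime import datetime


def excursions(readings, low_c: float, high_c: float):
    """Return the readings that fall outside the permitted range."""
    return [(ts, temp) for ts, temp in readings if not (low_c <= temp <= high_c)]


def largest_gap_minutes(readings) -> float:
    """Return the longest interval between consecutive readings, in minutes."""
    times = sorted(ts for ts, _ in readings)
    gaps = [(b - a).total_seconds() / 60 for a, b in zip(times, times[1:])]
    return max(gaps, default=0.0)


log = [
    (datetime(2025, 6, 2, 8, 0), 4.0),
    (datetime(2025, 6, 2, 8, 10), 4.5),
    (datetime(2025, 6, 2, 8, 40), 9.1),   # above the illustrative 8 °C limit
    (datetime(2025, 6, 2, 8, 50), 5.0),
]
print(excursions(log, low_c=2.0, high_c=8.0))
print(largest_gap_minutes(log))           # 30.0
```

Both results belong on the shipment record itself, so a rejected load can be disputed or conceded from the evidence rather than from a handheld device passed between shifts.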

Rejection at origin, caught before dispatch, is bearable; the documents get fixed. Rejection at destination is expensive in direct cost and worse in reputation, and an audit finding is worse still because audit findings travel. In the pallet-freight archetype, weak evidence costs money in rebookings; here it crosses into regulatory and commercial risk.

Document automation at the shipment level removes the person from the loop on the standard cases. Evidence capture ties to the shipment record directly, and human attention is reserved for the edge cases that genuinely need it. That is the configuration that scales when volume grows and DG regulations tighten at the same time.

Proof · Stihl

Stihl runs dangerous-goods shipping at volume from its Stenkullen warehouse in Sweden, on nShift Ship, integrated with Navision.

[Now] we automatically print these labels, saving us two to three minutes in processing time per order, and if you have 70 dangerous goods shipments a day, this is big savings.

Customer context

Stihl

Power-tool manufacturing, dangerous-goods dispatch at volume. Running on nShift Ship, integrated with Navision.

Peter Hermansson, IT Manager, STIHL

All five archetypes depend on the same thing: the shipment has to carry its own evidence. When the evidence travels with the freight, every downstream function reads from one record. When it arrives later as reconstruction, every function rebuilds its own version. The next section names the standard that removes the reconstruction.

Section 3

Evidence-Grade Delivery

Four in the afternoon on a claims deadline

The claims owner sits with the shipment number on a sticky note, pulling the manifest out of the warehouse system, the scan events out of a carrier portal, the proof of delivery out of a driver's tablet archive, the exception note out of a dock supervisor's email, and the photograph out of a service inbox. The receiver already closed its file and wants its credit. The carrier already closed its file and wants the charge confirmed. Finance already raised the invoice that now needs a line adjustment. Nobody in the room is doing new work. Everyone is reconstructing what the shipment already did.

That afternoon is the cost of a shipment that did not carry its own evidence. This section names the standard that removes the cost: Evidence-Grade Delivery.

Evidence-Grade Delivery is a shipment-level standard.

It names the evidence a B2B shipment has to carry so the shipper can run commercially, operationally, and sustainably without reassembling the record each time a question is asked of it.

Most supply chain conversations start with visibility dashboards. The dashboard reads well when each shipment underneath has produced its own scan events, POD, and exception notes. In most B2B operations today, the evidence for a shipment is produced after the fact, by people, from fragmented systems. Claims packs get assembled from inboxes when a claim lands. Customs data gets rekeyed at the moment a shipment has to move. Emissions reports get built from estimates because the shipment record does not carry the inputs. Each assembly takes time, and each introduces errors the other side can exploit.

Evidence-Grade Delivery means each booking, scan, POD, and exception lands on the record as the shipment moves. A shipment either carries that evidence or it does not. Archetype, carrier, lane, and customer do not change the definition.

The standard

A shipment that meets the standard carries six pieces of evidence

Order-level link

Every parcel, pallet, and leg ties back to the commercial order it belongs to, so the shipment can be read against the purchase order and against the parcel number from the same record. That matters when a receiver reports a shortage, a carrier disputes a delivery count, or finance reconciles an invoice against a partial arrival.

Normalized milestone data

Carriers report events differently: one's "out for delivery" is another's "delivery scheduled," and exception vocabularies overlap only partially. The standard translates those events into one common vocabulary, so a shipment's timeline reads the same way regardless of who carried it. Service, finance, and operations can then talk about the same shipment without translating carrier language first.

Documents generated from the shipment record

Commercial invoices, carrier paperwork, customs data, DG labels, and specialist export documents are produced from the shipment's own data rather than rekeyed from other systems. That is what keeps ICS2 pre-arrival data structured before the freight moves, and it is what gives an eFTI-aligned authority electronic freight information in the form it expects.

Exception detail at the point of exception

A refused dock intake is useful only if it is recorded against the shipment at the point it happens, with the reason and the rebook proposal attached. Otherwise it arrives downstream as a gap in the tracking data that service has to reconstruct.

Claims pack as one retrievable unit

Manifest, scan events, proof of delivery, exception record, receiver communication, and carrier response sit against the shipment, already together when a claim is needed. The claims window becomes a working timeline the shipper manages, not a deadline that runs against the shipper while the material is still being assembled.

Emissions data tied to the shipment

Emissions numbers are computed from the shipment's actual execution, against a recognized methodology. The report can be produced on demand, and the inputs can be audited by anyone who wants to inspect them.

Those six elements are the definition. A shipper can audit their own operation against them, line by line, and see where the standard is being met and where the evidence is being assembled the hard way.
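The line-by-line audit can be expressed as a simple checklist over a shipment record. This is a minimal sketch: the six keys are shorthand invented here for the six elements above, and a record is treated as a plain dictionary where a missing or empty value counts as a gap.

```python
# Shorthand keys for the six elements; the names are illustrative.
ELEMENTS = (
    "order_link",
    "normalized_milestones",
    "generated_documents",
    "exception_detail",
    "claims_pack",
    "emissions_data",
)


def audit(record: dict) -> dict[str, bool]:
    """Map each element to True when the record carries something for it."""
    return {element: bool(record.get(element)) for element in ELEMENTS}


def gaps(record: dict) -> list[str]:
    """List the elements the record does not carry."""
    return [element for element, present in audit(record).items() if not present]


record = {
    "order_link": "PO-1001",
    "normalized_milestones": ["PICKED_UP", "OUT_FOR_DELIVERY", "DELIVERED"],
    "generated_documents": ["commercial_invoice.pdf"],
    "exception_detail": [],        # nothing captured at the point of exception
    "claims_pack": None,           # would have to be assembled by hand
    "emissions_data": {"kg_co2e": 12.5},
}
print(gaps(record))  # ['exception_detail', 'claims_pack']
```

Note that an empty exception list is ambiguous in practice: it can mean no exception occurred or that none was captured. A fuller version would distinguish the two.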

Integration

The shipment record at the center

The six elements land at different moments: booking, pickup scan, POD, exception, claim reply, and emissions, all on the same record. The claims pack and the milestone view read the same scan events, POD, and exception notes. The emissions report and the cross-border customs data read off the same shipment record. A partial record produces partial evidence for every team that reads it.

Where the cost shows up

Weak evidence is expensive, and the cost is distributed

Each of the six elements maps to a specific kind of damage when it is missing. A weak order-level link produces service time spent reconstructing shortages and a pattern of receivers quietly tightening their own inspection regimes. Weak milestone normalization produces a tracking surface the team does not trust, and a rising volume of phone calls to forwarders and carriers. Document generation done outside the shipment record produces dwell at borders, rekey errors, and mismatches that cost the shipper the invoice dispute later. Exception detail captured only after the fact produces claims that are conceded because defending them would cost more than paying them. A claims pack that has to be assembled is sometimes never assembled, and the result is a reduction in recovered value that finance eventually notices. Emissions reporting built from estimates produces numbers that hold up only as long as nobody with authority wants to look closely.

The pattern across all six is the same. Weak evidence is expensive, and the cost is distributed across service, finance, operations, and reporting, so no one team carries a number they can escalate against. The leak rarely appears as one headline figure. It shows up in margin over time.

How control returns

Evidence-Grade Delivery runs in three layers

Execution surface

Carrier selection runs against rules held centrally. Documents generate from the shipment record. DG paperwork, cross-border customs data, and commercial invoices come out of the same preparation step. Carrier differences are hidden from the operation, and a carrier change becomes a configuration exercise rather than a systems project.

Milestone and exception layer

Events from every carrier are normalized against one vocabulary. Exceptions are captured against the shipment at the point of exception. Where a carrier's data is weak, the layer carries the honest state, including the confidence level, so the operation can act on the evidence it has rather than on the evidence it wishes it had.

Reporting and reverse-flow layer

Claims packs, emissions reports, and performance views read directly off the shipment record. nShift Returns runs as claims and reverse-flow discipline against the same shipment data that produced the original execution. The emissions report is a query against that data, rather than a reconstruction from estimates.
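To show what "a query against that data" could mean for emissions, the sketch below uses a simplified activity-based calculation: tonnes moved, times kilometres travelled, times an emission factor per mode. The factor values are made-up placeholders, not figures from any published methodology, and the function names are hypothetical; a real report would take its factors and allocation rules from the recognized methodology the shipper has adopted.

```python
# Placeholder factors in kg CO2e per tonne-kilometre. These numbers are
# invented for the example and must not be used for reporting.
PLACEHOLDER_FACTORS = {"road": 0.10, "sea": 0.01}


def leg_emissions_kg(weight_kg: float, distance_km: float, mode: str,
                     factors: dict = PLACEHOLDER_FACTORS) -> float:
    """Emissions for one leg: tonnes x kilometres x factor for the mode."""
    return (weight_kg / 1000) * distance_km * factors[mode]


def shipment_emissions_kg(weight_kg: float, legs: list) -> float:
    """Sum leg emissions; legs is a list of (mode, distance_km) pairs."""
    return sum(leg_emissions_kg(weight_kg, km, mode) for mode, km in legs)


# A 500 kg consignment moving road, then sea, then road.
legs = [("road", 100), ("sea", 1000), ("road", 50)]
print(round(shipment_emissions_kg(500, legs), 2))  # 12.5
```

Because the weight, the legs, and the distances all come from the shipment record, each input behind the reported number can be inspected when a receiver asks for the methodology.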

Proof · Dahl Suomi

Dahl Suomi, part of Saint-Gobain, is Finland's leading HVAC wholesaler, dispatching more than 700,000 parcels a year across 18 sites on nShift Ship, integrated with its ERP and Leanware WMS. The sites do not all run the same way. Smaller warehouses work with the interface directly; the Helsinki warehouse operates behind a WMS. One shipment record has to carry the same evidence across both.

At our smaller warehouses, for example, team members interface with nShift directly. But at our large Helsinki warehouse, where we have implemented a WMS, it works away invisibly in the background without people even really knowing it's there. So while everyone is connected to the same system and data, the user experience can vary according to the needs of the team.

Customer context

Dahl Suomi

Finland's leading HVAC wholesaler, part of Saint-Gobain. 700,000+ parcels per year across 18 sites. Running on nShift Ship, integrated with ERP and Leanware WMS.

Vaino Pokki, Training and Development Specialist, Dahl Suomi

The next section turns to the five places operations most often fail to meet the standard.

Section 4

Where operations lose control

A Thursday morning at the planner's desk

The planner pulls up the week's dashboard. Two carriers show every shipment as "in transit" when yesterday's phone calls from customers confirmed three of them arrived. The exception feed from the third carrier is empty, even though the driver filed a refused intake on Tuesday. The planner opens the carrier portals one by one and by lunchtime has rebuilt the picture of where the freight actually is. The forwarder gets the next phone call. The dashboard gets closed. By the end of the week, nobody on the team opens it first anymore, and nobody owns the drift, because the cost lands in service hours and planner hours nobody counts.

The dashboard, the carrier rules, the claims pack, the reconciliation across ERP and WMS, and the emissions report all draw from the same shipment record. When the record holds up, each of them works. When the record is partial, all five break in their own way.

Operations do not lose control in one place. They lose control in five predictable places.

Each of the five is fixable on its own, and the fix costs less than the service hours the gap keeps consuming. The rest of this section names them.

Point 1

Visibility gaps across carriers

Milestone events arrive in different shapes from different carriers. One returns "out for delivery," another returns "delivery scheduled," a third returns nothing between pickup and arrival. Exception vocabularies overlap only partially, and when a carrier invents a new status code to handle a new service, it may be weeks before the shipper's dashboards recognize it.

Planners start checking carrier portals before they trust the dashboard, and after a few weeks they stop opening the dashboard first at all. The team starts phoning forwarders instead. A surface that should give the team one working view becomes a loose collection of half-translated carrier events, and the cost lands in service hours and planner hours that never come back.

A normalization layer between the carrier feeds and the team is what keeps the surface usable. Events resolve into one vocabulary, regardless of who the carrier is. When a carrier reports nothing, the absence reads honestly as an absence, with the last known position and a confidence level attached, so the team can act on the evidence it has rather than assume something has been missed. Exception detail is captured at the carrier and presented at the operation's level.
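A minimal Python sketch of the normalization idea described above. The carrier names, status strings, the `STATUS_MAP` table, the three confidence labels, and the 12-hour quiet threshold are all assumptions made for illustration. They do not describe any real carrier feed or the way a particular platform implements its layer.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

# Hypothetical mapping from carrier-specific status text to one shared vocabulary.
# Real carrier codes differ; these entries are illustrative only.
STATUS_MAP = {
    ("carrier_a", "out for delivery"): "OUT_FOR_DELIVERY",
    ("carrier_b", "delivery scheduled"): "OUT_FOR_DELIVERY",
    ("carrier_a", "delivered"): "DELIVERED",
    ("carrier_b", "pod received"): "DELIVERED",
}


@dataclass
class NormalizedEvent:
    status: str
    observed_at: Optional[datetime]
    confidence: str  # "high", "low", or "none" (assumed labels)


def normalize(carrier: str, raw_status: str, observed_at: datetime) -> NormalizedEvent:
    """Translate one carrier event; unknown codes stay visible instead of being dropped."""
    status = STATUS_MAP.get((carrier, raw_status.strip().lower()))
    if status is None:
        return NormalizedEvent("UNRECOGNIZED", observed_at, "low")
    return NormalizedEvent(status, observed_at, "high")


def current_view(
    last_event: Optional[NormalizedEvent],
    now: datetime,
    quiet_after: timedelta = timedelta(hours=12),  # assumed threshold
) -> NormalizedEvent:
    """Report silence as silence: keep the last known state but downgrade confidence."""
    if last_event is None:
        return NormalizedEvent("NO_DATA", None, "none")
    if last_event.status != "DELIVERED" and now - last_event.observed_at > quiet_after:
        return NormalizedEvent(last_event.status, last_event.observed_at, "low")
    return last_event
```

The design choice worth noting is that an unrecognized code and a silent carrier both produce an explicit state rather than an empty row, so the team can see what is known and how far to trust it.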

Proof · Berggård Amundsen

Berggård Amundsen ships electronics across Norway to installers, power plants, and construction sites. More than 1,200 parcels a day move through a main warehouse and 25 service centres on up to 80 routes. Without one view across that carrier mix, shipments fall out of sight.

With [nShift], we don't waste time entering and finding data or tracking parcels in different systems. The integrations have improved our track and trace capability and allow all Berggård Amundsen employees to track all deliveries throughout the Group.

Customer context

Berggård Amundsen

Norwegian electronics wholesaler. Main warehouse plus 25 service centres, several carriers, up to 80 routes a day. Running on nShift Ship, integrated with Astro WMS and Infor M3.

Rolf Inge Danielsen, Logistics Director, Berggård Amundsen

Point 2

Governance of carrier rules and connectors

Carrier product offerings change constantly. A new service launches, a label format is updated, a rate card is restructured, a document required for a specific lane is deprecated and replaced. Maintaining label generation and rule logic in code creates an invisible maintenance backlog. Each change is small. The cumulative shape of the backlog is that shipments quietly drift out of compliance with the carrier's current state, and the evidence of that drift is a rising volume of small corrections inside the service queue, the dispute queue, and the rekey queue.

The cost of carrier maintenance is usually miscounted because it lands across several teams at once. Integration engineers update a label format, operations writes a workaround for a rule the platform does not yet enforce, service handles the exceptions the workaround produces, and finance pays the invoices where the rule was applied incorrectly. No one team holds the number, so nothing gets escalated.

Carrier governance works better as a platform capability. Rule changes land centrally, get tested against the shipper's operation, and roll out without the shipper's engineering team writing new integration code. New carriers enter the library as a configuration exercise, and the planner decides which carrier runs which lane directly.

Proof · Prime Cargo

Prime Cargo handles thousands of parcels a day from Denmark and Poland to customers across Europe and Asia. Each of its carriers brings its own label formats, data requirements, and update cadence. The warehouse flow has to absorb all of that without slowing down at peak.

Before, we coded our own labels for the different carrier products in our WMS system, which was time-consuming due to continuous updates from the carriers. Today, [nShift] does all this, so we instead are able to focus on what we do best.

Customer context

Prime Cargo

Global 3PL shipping from Denmark and Poland across Europe and Asia. Running on nShift Ship, integrated directly into WMS.

Peter Slatcher, Logistics Manager, Prime Cargo

Point 3

Claims assembly

Claims is where the shipment's evidence posture is tested hardest. A receiver reports a shortage, a damage, a wrong count, a missed appointment. A carrier's claims window opens. Inside that window, the shipper has to assemble enough evidence to hold the carrier to account, and the carrier has a version of the same data that is typically cleaner, because the carrier built it into their workflow from the beginning.

When the evidence lives in four places, the assembly is what takes the time. Every claim starts with a scavenger hunt across the warehouse system, the carrier portal, the driver's tablet, and the dock supervisor's inbox. The hunt is expensive, and the hunt is also incomplete, because some of what would hold the shipper's case is not in anyone's inbox and never was. Some claims are dropped because the recovery does not justify the work. Others are defended on thinner material than they should be, and lost.

The operating model keeps the pack building in real time. Manifest, scan events, proof of delivery, receiver communication, carrier response, and exception detail all arrive at the pack at the moment they are produced, not at the moment they are needed. When the receiver's note lands, the pack is already there. The claims window becomes a working timeline the shipper manages, not a deadline running against the shipper while the material is still being assembled.
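One way to picture a pack that builds in real time is a container keyed to the shipment that accepts evidence as it is produced and can always say what is still outstanding. The sketch below is a minimal Python illustration under that assumption. The `ClaimsPack` class, its method names, and the reference strings are hypothetical, not a description of any product's data model.

```python
from dataclasses import dataclass, field
from datetime import datetime

# The six evidence types named in the paragraph above.
REQUIRED_EVIDENCE = (
    "manifest",
    "scan_events",
    "proof_of_delivery",
    "receiver_communication",
    "carrier_response",
    "exception_detail",
)


@dataclass
class ClaimsPack:
    shipment_id: str
    # evidence type -> list of (produced_at, reference) tuples
    items: dict = field(default_factory=dict)

    def attach(self, kind: str, reference: str, produced_at: datetime) -> None:
        """Record a piece of evidence at the moment it is produced."""
        if kind not in REQUIRED_EVIDENCE:
            raise ValueError(f"unknown evidence type: {kind}")
        self.items.setdefault(kind, []).append((produced_at, reference))

    def missing(self) -> list:
        """Evidence types not yet attached, in the order listed above."""
        return [kind for kind in REQUIRED_EVIDENCE if kind not in self.items]
```

When a receiver's note lands, calling `missing()` on the pack shows what is still outstanding, so the claims window is spent on the gaps rather than on the search.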

Point 4

Cross-system drift

ERP, WMS, TMS, and customer service each hold a partial view of the same shipment. ERP knows the commercial terms. WMS knows what was picked and packed. TMS knows what was booked. Customer service knows what the receiver has said about it. In most operations, these views agree on the big facts and disagree on the small ones. Status, count, timing, carrier, service level, cost. Small disagreements accumulate into large reconciliation meetings.

The reconciliation is the tax on working in separate surfaces, and the cost sits inside the operating hours of people whose job is not reconciliation. Finance arrives at a meeting with one version of a shipment. Operations arrives with another. Service arrives with a third. The time spent closing the gap is time nobody will claim back, and the drift is invisible to the senior leader reading a dashboard, because the dashboard is itself one of the disagreements.

Shipment truth has to live in one surface, with the rest of the stack reading from it. Execution, milestones, exceptions, and outcomes sit on the shipment record. Integrations into ERP, WMS, and customer service draw from that record rather than maintain their own version of it. When a receiver opens a conversation with service, service is working from the same record finance will invoice against and operations will plan against.
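The drift described in this point can be made visible with a simple comparison between the shipment record and each system's partial view. The Python sketch below is illustrative only. The field names follow the list given earlier in this point, while the system names and sample values are assumptions.

```python
from typing import Any, Dict

# Fields this point lists as common sources of small disagreement.
COMPARED_FIELDS = ("status", "count", "timing", "carrier", "service_level", "cost")


def find_drift(
    record: Dict[str, Any], views: Dict[str, Dict[str, Any]]
) -> Dict[str, Dict[str, Any]]:
    """Return, per system, the fields whose value differs from the shipment record.

    A field a system does not hold at all is skipped: a partial view is not drift.
    """
    drift: Dict[str, Dict[str, Any]] = {}
    for system, view in views.items():
        diffs = {
            name: view[name]
            for name in COMPARED_FIELDS
            if name in view and name in record and view[name] != record[name]
        }
        if diffs:
            drift[system] = diffs
    return drift


# Hypothetical example: ERP lags on status, customer service holds a different count.
record = {"status": "DELIVERED", "count": 3, "carrier": "carrier_a", "cost": 120.0}
views = {
    "erp": {"status": "IN_TRANSIT", "cost": 120.0},
    "wms": {"count": 3, "carrier": "carrier_a"},
    "service": {"status": "DELIVERED", "count": 2},
}
drift_report = find_drift(record, views)
```

A check like this only reports the disagreement. The structural fix the text argues for is that the other systems read from the record and stop keeping their own copy.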

Proof · Solar Screen

Solar Screen ships smart film products across four European markets on nShift Transsmart, integrated with NetSuite ERP. Before the integration, shipment status, tracking, and cost data all lived in separate systems, and warehouse operators had to switch between carrier interfaces manually.

We can now upstream information in our ERP regarding cost and tracking status, which makes life much easier for managing transport and provides customers with more information about their shipment.

Customer context

Solar Screen

Smart film manufacturer, four European markets. Running on nShift Transsmart, integrated with NetSuite ERP.

Nicolas Hoet, Chief Operations Officer, Solar Screen

Point 5

Emissions reporting

Emissions has arrived inside the commercial relationship. Large receivers are asking for shipment-level numbers against a recognized methodology. CSRD-aligned reporting, supplier questionnaires, and sector-specific programs are converging on the same expectation: the shipper can account for the emissions of a specific shipment, not an average across a quarter or a category.

The problem is the input. An emissions number built from modal averages and aggregated distance is an estimate. The estimate holds only as long as nobody with authority wants to look at the inputs. When a receiver's procurement team asks for the methodology and the source data, the estimate falls over. The shipper is then in the position of building the report and defending the inputs at the same time, which is a conversation the shipper does not win.

The alternative is to compute emissions from the shipment record. The execution surface already knows which carrier moved the load, on which lane, with which service, at which weight. Against a recognized methodology, those inputs compose a number that can be produced on demand and audited by anyone who wants to audit it. The report becomes a query against shipment data, rather than a reconstruction from averages that were never tied to the shipment in the first place.
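As a rough illustration of what computing from the shipment record can look like, the Python sketch below multiplies tonne-kilometres by an intensity factor, a common general shape for freight emissions calculations, and returns the inputs next to the result so they can be traced. It is a sketch under stated assumptions. The factor values are invented placeholders from no published methodology, the `ShipmentLeg` fields are hypothetical, and distance is assumed to be derivable from the lane held on the record.

```python
from dataclasses import dataclass

# PLACEHOLDER intensity factors in kg CO2e per tonne-km, keyed by (carrier, service).
# These values are invented for illustration and come from no published methodology.
PLACEHOLDER_FACTORS = {
    ("carrier_a", "road_standard"): 0.10,
    ("carrier_b", "rail"): 0.02,
}


@dataclass(frozen=True)
class ShipmentLeg:
    carrier: str
    service: str
    distance_km: float  # assumed to be derived from the lane on the shipment record
    weight_kg: float


def leg_emissions_kg(leg: ShipmentLeg) -> dict:
    """Compute one leg's emissions and return the inputs alongside the result."""
    key = (leg.carrier, leg.service)
    if key not in PLACEHOLDER_FACTORS:
        # Surface the gap instead of silently falling back to an average.
        return {
            "kg_co2e": None,
            "inputs": leg,
            "factor": None,
            "note": "no factor for carrier/service",
        }
    factor = PLACEHOLDER_FACTORS[key]
    tonne_km = (leg.weight_kg / 1000.0) * leg.distance_km
    return {"kg_co2e": tonne_km * factor, "inputs": leg, "factor": factor, "note": ""}
```

Returning the inputs and the factor with every number is the point of the exercise: whoever audits the figure can see exactly what it was built from, and a missing input is named rather than averaged away.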

Most of the cost of weak evidence in a B2B operation shows up across these five points. The platforms that hold up in multi-carrier operations carry all five on the same shipment record, not as five separate modules. Each of the five is addressable on its own, and each is cheaper to design out than to absorb. The practical question for a buyer is how to test a platform against them before commitment, so the evaluation runs against operating reality rather than against a feature list. The next section builds that test.

Section 5

Buyer evaluation framework

Evidence-Grade Delivery becomes concrete once a buyer sits across from a platform vendor. The six elements turn into six questions the buyer can ask directly. No scoring grid is needed. What separates a platform that produces the evidence from one that sells it is how specifically the vendor can describe each element of the shipment record, and where each element lives when a shipment is in motion.

The shape of the answers is usually clear inside the first working session.

The six questions follow the shipment from creation through to the reporting it produces afterward, in the same order the operation meets them. Taken together, they frame an evaluation that reads the way a shipment reads.

Evaluation framework

Six questions nShift uses with buyers

We use these six questions with buyers who want to test whether a platform actually produces the evidence the standard describes. Each question below links to its deep dive, and every question has a live operational test inside the pilot section that follows.

Question 1

Order-to-shipment integrity

Does the platform hold the shipment against the commercial order, or only against the parcel?

Every B2B operation that ships more than one unit per order depends on the link between the shipment and the commercial order, and that link is the first thing to fail when a shortage lands with a receiver.

The live test is a split order: three parcels on the same purchase order, separated in transit across two carriers, arriving on different days, one of them short. The buyer watches what the platform shows while that sequence unfolds. If the workflow treats the parcel as the anchoring object, the conversation with the receiver will start from the wrong place every time, because the service team will be reconciling parcels to an order the platform does not hold. If the workflow holds the order as the anchoring object, with parcels reconciling against it and exceptions surfacing at order resolution, service starts the conversation from the record the receiver already has.
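The split-order test can be sketched in a few lines of Python. The order is the anchoring object, parcels attach to it, and a shortage is computed at order level across all carriers. Class names, SKUs, and quantities are hypothetical and chosen only to mirror the three-parcel, two-carrier scenario above.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Parcel:
    parcel_id: str
    carrier: str
    delivered: Dict[str, int] = field(default_factory=dict)  # SKU -> units received


@dataclass
class Order:
    po_number: str
    ordered: Dict[str, int]  # SKU -> units on the purchase order
    parcels: List[Parcel] = field(default_factory=list)

    def shortages(self) -> Dict[str, int]:
        """Units still owed per SKU, summed across every parcel on the order."""
        received: Dict[str, int] = {}
        for parcel in self.parcels:
            for sku, qty in parcel.delivered.items():
                received[sku] = received.get(sku, 0) + qty
        return {
            sku: qty - received.get(sku, 0)
            for sku, qty in self.ordered.items()
            if received.get(sku, 0) < qty
        }


# Three parcels on one purchase order, two carriers, one parcel short.
order = Order(
    po_number="PO-1",
    ordered={"valve": 10, "pipe": 20},
    parcels=[
        Parcel("P1", "carrier_a", {"valve": 10}),
        Parcel("P2", "carrier_b", {"pipe": 12}),
        Parcel("P3", "carrier_b", {"pipe": 6}),
    ],
)
open_shortages = order.shortages()
```

Because the shortage is expressed against the purchase order, the service team starts from the same reference the receiver holds, whichever carrier moved the short parcel.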

Question 2

Carrier coverage and governance

Where does carrier maintenance live, and how fast does the platform reflect a carrier's changes?

Coverage reads as two questions. The first is the size of the carrier library and whether the specific carriers a given operation depends on are already inside it. The second, usually the one that determines whether the platform holds up, is how the platform handles change. Carriers update label formats, introduce new services, restructure rate cards, and deprecate documents on their own schedule. A shipper that maintains its own integration code against those changes is running an invisible maintenance backlog, and the backlog grows.

The decisive issue is where that maintenance lives. When carrier governance sits inside the platform, configuration tracks the carrier's current state without the buyer's engineering team in the loop. A slide full of integrations says less than the operational details: how quickly a label-format change appears, what happens when a carrier launches a new service on a Monday afternoon, and who inside the vendor signs off the rollout.

Proof · Aditro Logistics

Aditro Logistics is a carrier-independent 3PL serving more than 100 retail, grocery, and industry customers across large warehouses in Norway and Sweden. Every carrier in that library brings its own routes, label formats, product definitions, and validation rules, updated on the carrier's schedule. The operation runs on nShift Ship integrated with Jeeves, Astro WMS, and Ongoing WMS, with carrier governance held inside the platform.

Our customers experience fewer errors in deliveries due to better tracking and control as well as a higher delivery speed. Carriers are pleased that we use [nShift], as we are able to handle their demands for updated routes, formats, products, validation and so on. This means that carriers do not have to wait for us and that we do not ship to wrong addresses.

Customer context

Aditro Logistics

Nordic carrier-independent 3PL, 100+ retail, grocery, and industry customers. Running on nShift Ship, integrated with Jeeves, Astro WMS, and Ongoing WMS.

Fredrik Krysén, Transport Director, Aditro Logistics

Question 3

Normalized milestone and exception data

Does the platform translate carrier events into one vocabulary, and what does it show when a carrier reports nothing?

A milestone feed that returns ten thousand events a day in ten different vocabularies is a feed, not a surface. For a domestic single-carrier operation, this matters less. For a multi-carrier or cross-border network, one vocabulary across every carrier is among the first things a buyer should test. The platform has to translate carrier events into that common vocabulary, and has to carry exceptions honestly when a carrier reports nothing useful.

The narrow test is the quiet carrier. On a shipment moving with a regional haulier that reports poorly, what does the platform show the team between pickup and delivery? If the surface is the carrier's logo and not much else, the normalization layer is thin. If the absence reads as absence, with the last known position, the carrier's historical behavior on that lane, and a confidence level attached, the normalization layer is doing the work it was built for. A live walkthrough of that absence tells the buyer whether the milestone surface is real or stitched together.

Question 4

Document and customs readiness

Are documents generated from the shipment record, or rekeyed from another system?

Documents are where the shipment's evidence holds up at the border or collapses. The decisive point is where the data comes from. Commercial invoices, customs data, DG paperwork, and export labels generated from the shipment record arrive in the right shape when the carrier or the customs authority asks for them. When the shipper has to rekey fields from another system into a shipping screen, the rekey errors and the border dwell follow.

ICS2 Release 3, operational across all transport modes in the EU since September 2025 with limited derogations in some member states, made pre-arrival cargo data a structured requirement rather than a best practice. The eFTI Regulation is extending the same posture across road, rail, inland waterways, and air. In 2026 and beyond, a B2B buyer should expect any serious platform to explain how it supports these requirements. The useful questions are how the required data is generated, how it is maintained against interpretation changes, and who owns the update when a customs authority changes how it reads a field. A polished walkthrough of the form does not answer that.

Question 5

Claims and reverse-flow evidence

Is the claims pack retrievable as one unit against the shipment, or assembled from inboxes when a claim arrives?

Claims is where the full shipment record is tested against a deadline. The pack either sits as one unit against the shipment or it has to be assembled from inboxes when a claim arrives. When claims runs as a reverse flow against the shipment record, the evidence stays continuous and the pack is already there when the carrier's window opens. If shipment data is only fed into claims after the fact, assembly becomes its own discipline and the window runs against the shipper.

The operational test is the clock. Given a shortage report at 9 a.m., with the carrier's claims window closing in forty-eight hours, how much of that time goes to assembling the pack, and how much to defending or negotiating it? If assembly consumes the window, the evidence posture is weak, regardless of how the surface looks in a controlled walkthrough.

Question 6

Emissions reporting at the shipment level

Can the platform produce a shipment-level emissions report with traceable inputs and a cited methodology, on the spot?

Emissions is the newest of the six questions and the one most at risk of being evaluated against the wrong criteria. The quality of the report is a function of the quality of the inputs, and the inputs live inside the execution data. Buyers need a methodology answer here, not a prettier dashboard. Numbers computed from the shipment record, against a recognized methodology, with a documented handling for cases where carrier data is thin, are numbers a receiver's procurement team can interrogate. Modal averages and aggregated distance produce an estimate that usually fails at the first serious question.

The live test is an audit: one specific shipment, reported end to end, with the methodology cited, the inputs traced to their source, and an explicit handling named for the input the carrier did not supply. If the vendor cannot produce that on the spot, the emissions number will become a commercial problem when larger receivers start asking for the workings behind the figure.

Taken together, the six questions turn the Evidence-Grade Delivery standard into a working conversation a B2B buyer can run against any platform, including nShift. The next step is to run that conversation on the buyer's own shipments, lanes, and commercial constraints. The next section describes the pilot that does that.

Section 6

Pilot and proof of fit

A Tuesday in the pilot review room

The deployment team has spent three days inside a sandbox. The feature list is ticked. The procurement template is populated. The sandbox shows a shipment moving through the platform cleanly, and the document that comes out reads as approval. What the document does not show is whether the six evidence elements produce themselves on the buyer's own production shipments, at the buyer's own production rate, when a carrier goes quiet or a dock refuses a window. The first production quarter surfaces those gaps. The pilot was notionally designed to surface them and did not.

A demo cannot answer whether Evidence-Grade Delivery is produced on the buyer's own shipments.

The six-question framework turns into a decision the moment a platform is put in front of real shipments. A pilot designed around the evidence standard is the shortest path between the six questions and a commitment, because the pilot tests whether the six elements actually produce themselves on the buyer's own shipments, in the buyer's own lanes, with the buyer's own carriers. A pilot testing the feature list rather than the evidence standard ends in a decision shaped by the demo, and the first production quarter is where the gaps show up.

Design

Designing the pilot

A pilot scoped to produce a decision is narrow. It takes one archetype, one lane, and one of the five control-loss points from Section 4, and runs the six evidence elements as the tests across a working window. For pallet freight into retail, that usually means one named receiver and a test against visibility gaps or claims assembly. Multimodal flows need a different slice: one origin, one destination, two legs, and a test against exception data or cross-system drift.

The choice of scope is itself informative. The pilot is most useful when it runs on the archetype that has generated the last three difficult conversations with a receiver, the lane that sits at the center of a commercial review, and the control-loss point that has pulled the most reconciliation time out of the team inside the last quarter. Picking a harder scope produces a pilot the team can act on.

Criteria

Criteria against evidence

The criteria are written against the six evidence elements. Inside the pilot window, on production shipments, the platform produces order-level records against every shipment, a normalized milestone view across the carriers running the lane, documents generated from the shipment record, exception detail captured at the point of exception, a claims pack retrievable as one unit on a test claim, and an emissions number computed from the execution data. Each of those is observable on the shipments themselves. Each either happens at production rate or it does not.

Writing the criteria against evidence rather than against features matters in practice. A feature can tick a box in a demo and still fail to produce the POD, the exception record, or the claims pack under production load. At exit, the conversation starts with what the platform actually produced, not with what it showed.

Exit logic

Named at the start, before the pilot begins.

A pass is all six evidence elements produced against the scoped archetype and lane, at a volume consistent with the shipper's production rate. A fail is any element not produced, or produced only through manual assembly by the shipper's team. A partial result is named as partial, with the missing element identified, and the buyer decides on the partial with eyes open. Sign-off sits with the operations leader. The procurement signature and the deployment signature follow the operational one rather than leading it, because the question the pilot is answering is an operating question.
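The exit logic above can be written down before the pilot starts, which is one way to keep it from shifting afterwards. The Python sketch below is a minimal illustration. The element keys and the three observation labels ("produced", "manual", "missing") are assumed names. The function reports only whether the pass condition is met and which elements fell short, leaving the decision on a partial result to the buyer, as the text describes.

```python
from typing import Dict, List

# The six evidence elements, with assumed short names.
EVIDENCE_ELEMENTS = (
    "order_level_record",
    "normalized_milestones",
    "generated_documents",
    "exception_detail",
    "claims_pack",
    "emissions_number",
)


def exit_review(observed: Dict[str, str]) -> Dict[str, object]:
    """Summarize a pilot against the six elements.

    `observed` maps each element to "produced", "manual", or "missing".
    Elements absent from the mapping count as "missing". Manual assembly by
    the shipper's team does not count as produced.
    """
    not_met: List[str] = [
        element
        for element in EVIDENCE_ELEMENTS
        if observed.get(element, "missing") != "produced"
    ]
    return {"passed": not not_met, "elements_not_met": not_met}
```

A review where the claims pack needed manual assembly and no emissions number appeared would return `passed` as `False` with both elements named, which is the list the operations leader signs off against.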

Proof · Mobile Climate Control

Mobile Climate Control's logistics, transport, and finance teams selected nShift TMS to standardise shipment ordering across production sites spanning Sweden, Canada, China, the US, South Africa, and Poland. Before the platform, bookings moved through emails, phone calls, and separate carrier portals. The effect was visible in the first months of operation.

After the first months of the solution being in operation, it was already possible to conclude that the processes are simpler to complete and more intuitive. Measurable savings in time are also evident, which translate into financial benefits, as well as benefits resulting from analyzing reports.

Customer context

Mobile Climate Control

Global HVAC manufacturer with a major production site in Oława, Poland. Logistics, transport, and finance teams using nShift TMS to standardise shipment ordering and reporting.

Mariusz Król, Managing Director and President for Europe, Mobile Climate Control

A scoped pilot against one archetype, one lane, and one control-loss point is the cleanest way to see whether the operating model holds inside a specific operation.

Appendix

Future signals

In 2026, three signals are already raising the cost of not reaching Evidence-Grade Delivery. None changes what the standard means. All three are visible enough to be named.

Overview

The three signals

Signal 1

Agentic commerce on the buyer side

Consumer-side buying agents are already visible in retail flows, placing purchases on behalf of users against criteria and budgets they have been given. The same pattern, in a slower register, is arriving in B2B procurement. Agents compare suppliers, submit orders against a catalog, and in some categories already select delivery options inside the flow. For a shipper, the question is what the agent reads when it places the order.

Agents read structured data. An agent evaluating a supplier will ask for shipment-level evidence, lane-level performance, emissions numbers, and claim history in a machine-readable form. Evidence that already sits against the shipment record is in the shape the agent can consume. Evidence assembled by hand after the fact will not surface, because the agent will not wait for the assembly, and the assembly is not a readable artifact by the time it arrives.

Signal 2

Structured digital freight information

The eFTI Regulation is moving freight information away from paper and toward structured electronic data across road, rail, inland waterways, and air. The pattern mirrors ICS2 on the customs side: data must be in the right shape before the physical movement, and must be retrievable by the relevant authority in that shape afterwards. What began with ICS2 customs data is becoming the expectation for freight information in every one of those modes.

Operations that already generate freight information from the shipment record will recognize the regulation as formalizing what they already do. Operations still assembling that information the night before a movement will feel it as schedule pressure that exposes the gap. The cost of arriving late shows up in holding times, border dwell, rebookings, and the conversations with receivers that follow those delays.

Signal 3

Emissions reporting as a buyer requirement

Emissions reporting has moved inside a short period from a sustainability-team concern into a procurement concern. Large receivers are asking for shipment-level emissions as part of the supplier evaluation, alongside price, service level, and lead time. Procurement is converging on reportable numbers that sit inside the commercial conversation, at the shipment level, against a recognized methodology, auditable on request by the people writing the procurement questionnaires.

An emissions report that reads off the shipment record can be produced on demand, and the inputs can be defended. A report built from estimates will meet a procurement question the estimate does not survive. The difference starts as a tiebreaker on lanes where two shippers are otherwise equivalent, and then, inside larger receivers, hardens into a qualifier. The reports that fail will fail quietly, at the stage before the shortlist forms, and the commercial consequence will follow without a visible trigger.

Common thread

What the signals share

All three signals point in the same direction. Agentic buyers will read the shipment record directly. Freight information regulation will require data that is already structured. Procurement teams will ask for emissions numbers they can audit. The common thread is that evidence produced at the point of execution holds up, and evidence assembled after the fact does not. For operations already meeting the standard, these signals confirm the investment.

The window narrows for operations still reconstructing evidence after the fact.

Results from manufacturers, distributors, and logistics teams

800 pallets/day
dispatched on schedule by Unilever Sweden, often within 30-minute delivery windows

700,000 tons/year
of steel shipped by Ovako through 20 carriers using one platform

40,000+ SKUs
managed by Asmet with full logistics agility, seamless ERP integration and wide carrier coverage

Multi-continent operations
standardized within months by Mobile Climate Control, replacing fragmented booking

Frequently asked questions

What is Evidence-Grade Delivery?

Evidence-Grade Delivery is a shipment-level standard. It is built on six elements: an order-level link, normalized milestone data, documents generated from the shipment record, exception detail captured at the point of exception, a claims pack retrievable as one unit, and emissions data tied to the shipment. When a shipment meets the standard, the record answers any question asked of it without anyone rebuilding the story by hand.

How is B2B shipping different from B2C shipping?

Three clocks. The contract runs on negotiated SLAs between procurement and an account manager. The delivery runs on a booked dock appointment with a named receiver. The evidence runs on what the carrier, driver, and receiver sign for at handover. Consumer delivery playbooks address none of the three, which is why they break when applied to pallets.

What are the five B2B shipment archetypes?

The guide focuses on five shipment archetypes that create different evidence demands: multi-parcel B2B, pallet freight, multimodal, cross-border, and dangerous goods / temperature-sensitive shipments. Most B2B operations run more than one of these at once, which is why a single undifferentiated shipping setup usually leaks control.

What should I ask when evaluating a delivery management platform?

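Six questions built for a working vendor session. Does the platform hold the shipment against the commercial order, or only against the parcel? Where does carrier maintenance live, and how fast are carrier changes reflected? Are carrier events translated into one vocabulary, and what shows when a carrier reports nothing? Are documents generated from the shipment record or rekeyed? Is the claims pack retrievable as one unit? Can the platform produce a shipment-level emissions report with traceable inputs on the spot? The guide walks each question with what a passing answer sounds like.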

How do I pilot a B2B delivery platform before committing?

Scope the pilot against the evidence standard. Pick one archetype from your mix, one lane you know is leaky, and run three weeks. If the shipment record that comes out the other side can answer a claim, a dock dispute, and an emissions audit without a phone call, the platform has proven fit.