|  |  | 
|  | ------- | 
|  | PHY Abstraction Layer | 
|  | (Updated 2008-04-08) | 
|  |  | 
|  | Purpose | 
|  |  | 
|  | Most network devices consist of a set of registers which provide an interface | 
|  | to a MAC layer, which communicates with the physical connection through a | 
|  | PHY.  The PHY concerns itself with negotiating link parameters with the link | 
|  | partner on the other side of the network connection (typically, an ethernet | 
|  | cable), and provides a register interface to allow drivers to determine what | 
|  | settings were chosen, and to configure what settings are allowed. | 
|  |  | 
|  | While these devices are distinct from the network devices, and conform to a | 
|  | standard layout for the registers, it has been common practice to integrate | 
|  | the PHY management code with the network driver.  This has resulted in large | 
|  | amounts of redundant code.  Also, on embedded systems with multiple (and | 
|  | sometimes quite different) ethernet controllers connected to the same | 
|  | management bus, it is difficult to ensure safe use of the bus. | 
|  |  | 
|  | Since the PHYs are devices, and the management busses through which they are | 
|  | accessed are, in fact, busses, the PHY Abstraction Layer treats them as such. | 
|  | In doing so, it has these goals: | 
|  |  | 
|  | 1) Increase code-reuse | 
|  | 2) Increase overall code-maintainability | 
|  | 3) Speed development time for new network drivers, and for new systems | 
|  |  | 
|  | Basically, this layer is meant to provide an interface to PHY devices which | 
|  | allows network driver writers to write as little code as possible, while | 
|  | still providing a full feature set. | 
|  |  | 
|  | The MDIO bus | 
|  |  | 
|  | Most network devices are connected to a PHY by means of a management bus. | 
|  | Different devices use different busses (though some share common interfaces). | 
|  | In order to take advantage of the PAL, each bus interface needs to be | 
|  | registered as a distinct device. | 
|  |  | 
|  | 1) read and write functions must be implemented.  Their prototypes are: | 
|  |  | 
|  | int write(struct mii_bus *bus, int mii_id, int regnum, u16 value); | 
|  | int read(struct mii_bus *bus, int mii_id, int regnum); | 
|  |  | 
|  | mii_id is the address on the bus for the PHY, and regnum is the register | 
|  | number.  These functions are guaranteed not to be called from interrupt | 
|  | time, so it is safe for them to block, waiting for an interrupt to signal | 
|  | the operation is complete. | 
|  |  | 
|  | 2) A reset function is optional.  This is used to return the bus to an | 
|  | initialized state. | 
|  |  | 
|  | 3) A probe function is needed.  This function should set up anything the bus | 
|  | driver needs, set up the mii_bus structure, and register with the PAL using | 
|  | mdiobus_register.  Similarly, there's a remove function to undo all of | 
|  | that (use mdiobus_unregister). | 
|  |  | 
|  | 4) Like any driver, the device_driver structure must be configured, and init | 
|  | and exit functions are used to register and unregister the driver. | 
|  |  | 
|  | 5) The bus must also be declared somewhere as a device, and registered. | 
|  |  | 
|  | As an example of how one driver implemented an mdio bus driver, see | 
|  | drivers/net/ethernet/freescale/fsl_pq_mdio.c and an associated DTS file | 
|  | for one of the users (e.g. "git grep fsl,.*-mdio arch/powerpc/boot/dts/"). | 
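The read, write and reset callbacks from the list above can be modelled in
plain userspace C.  The following is only a sketch: struct sketch_mii_bus is
a simplified stand-in for the kernel's struct mii_bus, struct sketch_mdio_hw
is a hypothetical in-memory register file, and every sketch_* name is
invented for illustration.  A real bus driver would talk to hardware and
finish its probe function by calling mdiobus_register.

```c
#include <stdint.h>
#include <string.h>

#define SKETCH_PHY_MAX_ADDR 32	/* addresses on the management bus */
#define SKETCH_PHY_MAX_REG  32	/* registers per PHY */

/* Simplified stand-in for the kernel's struct mii_bus (assumption). */
struct sketch_mii_bus {
	const char *name;
	void *priv;
	int (*read)(struct sketch_mii_bus *bus, int mii_id, int regnum);
	int (*write)(struct sketch_mii_bus *bus, int mii_id, int regnum,
		     uint16_t value);
	int (*reset)(struct sketch_mii_bus *bus);
};

/* Hypothetical "hardware": one register file per PHY address. */
struct sketch_mdio_hw {
	uint16_t regs[SKETCH_PHY_MAX_ADDR][SKETCH_PHY_MAX_REG];
};

static int sketch_addr_ok(int mii_id, int regnum)
{
	return mii_id >= 0 && mii_id < SKETCH_PHY_MAX_ADDR &&
	       regnum >= 0 && regnum < SKETCH_PHY_MAX_REG;
}

/* Returns the 16-bit register value, or a negative value on error. */
static int sketch_mdio_read(struct sketch_mii_bus *bus, int mii_id, int regnum)
{
	struct sketch_mdio_hw *hw = bus->priv;

	if (!sketch_addr_ok(mii_id, regnum))
		return -1;
	return hw->regs[mii_id][regnum];
}

/* Returns 0 on success, or a negative value on error. */
static int sketch_mdio_write(struct sketch_mii_bus *bus, int mii_id,
			     int regnum, uint16_t value)
{
	struct sketch_mdio_hw *hw = bus->priv;

	if (!sketch_addr_ok(mii_id, regnum))
		return -1;
	hw->regs[mii_id][regnum] = value;
	return 0;
}

/* Optional reset: return the bus to an initialized state. */
static int sketch_mdio_reset(struct sketch_mii_bus *bus)
{
	memset(bus->priv, 0, sizeof(struct sketch_mdio_hw));
	return 0;
}

/* Probe-time setup: fill in the bus structure before registering it. */
static void sketch_mdio_setup(struct sketch_mii_bus *bus,
			      struct sketch_mdio_hw *hw)
{
	bus->name = "sketch-mdio";
	bus->priv = hw;
	bus->read = sketch_mdio_read;
	bus->write = sketch_mdio_write;
	bus->reset = sketch_mdio_reset;
	/* A real driver would register the bus with the PAL here. */
}
```

Because the callbacks are never invoked from interrupt time, a real
implementation is free to sleep inside them while it waits for the
hardware.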
|  |  | 
|  | (RG)MII/electrical interface considerations | 
|  |  | 
|  | The Reduced Gigabit Medium Independent Interface (RGMII) is a 12-pin | 
|  | electrical signal interface using a synchronous 125MHz clock signal and several | 
|  | data lines. Due to this design decision, a 1.5ns to 2ns delay must be added | 
|  | between the clock line (RXC or TXC) and the data lines to let the PHY (clock | 
|  | sink) have enough setup and hold times to sample the data lines correctly. The | 
|  | PHY library offers different types of PHY_INTERFACE_MODE_RGMII* values to let | 
|  | the PHY driver, and optionally the MAC driver, implement the required delay. The | 
|  | values of phy_interface_t must be understood from the perspective of the PHY | 
|  | device itself, leading to the following: | 
|  |  | 
|  | * PHY_INTERFACE_MODE_RGMII: the PHY is not responsible for inserting any | 
|  | internal delay by itself, it assumes that either the Ethernet MAC (if capable) | 
|  | or the PCB traces insert the correct 1.5-2ns delay | 
|  |  | 
|  | * PHY_INTERFACE_MODE_RGMII_TXID: the PHY should insert an internal delay | 
|  | for the transmit data lines (TXD[3:0]) processed by the PHY device | 
|  |  | 
|  | * PHY_INTERFACE_MODE_RGMII_RXID: the PHY should insert an internal delay | 
|  | for the receive data lines (RXD[3:0]) processed by the PHY device | 
|  |  | 
|  | * PHY_INTERFACE_MODE_RGMII_ID: the PHY should insert internal delays for | 
|  | both transmit AND receive data lines from/to the PHY device | 
|  |  | 
|  | Whenever possible, use the PHY side RGMII delay for these reasons: | 
|  |  | 
|  | * PHY devices may offer sub-nanosecond granularity in how they allow a | 
|  | receiver/transmitter side delay (e.g: 0.5, 1.0, 1.5ns) to be specified. Such | 
|  | precision may be required to account for differences in PCB trace lengths | 
|  |  | 
|  | * PHY devices are typically qualified for a large range of applications | 
|  | (industrial, medical, automotive...), and they provide a constant and | 
|  | reliable delay across temperature/pressure/voltage ranges | 
|  |  | 
|  | * PHY device drivers in PHYLIB are reusable by nature, so a driver that can | 
|  | correctly configure a specified delay enables more designs with similar delay | 
|  | requirements to operate correctly | 
|  |  | 
|  | For cases where the PHY is not capable of providing this delay, but the | 
|  | Ethernet MAC driver is capable of doing so, the correct phy_interface_t value | 
|  | should be PHY_INTERFACE_MODE_RGMII, and the Ethernet MAC driver should be | 
|  | configured correctly in order to provide the required transmit and/or receive | 
|  | side delay from the perspective of the PHY device. Conversely, if the Ethernet | 
|  | MAC driver looks at the phy_interface_t value, for any other mode but | 
|  | PHY_INTERFACE_MODE_RGMII, it should make sure that the MAC-level delays are | 
|  | disabled. | 
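The division of responsibility described above can be written down as two
small predicates.  This is a sketch under stated assumptions: the
enumerators are local stand-ins that mirror the PHY_INTERFACE_MODE_RGMII*
values, and the sketch_* names do not exist in the PHY library.

```c
#include <stdbool.h>

/* Local stand-ins mirroring the PHY_INTERFACE_MODE_RGMII* values. */
enum sketch_rgmii_mode {
	SKETCH_RGMII,		/* PHY inserts no delay */
	SKETCH_RGMII_ID,	/* PHY delays both TX and RX */
	SKETCH_RGMII_RXID,	/* PHY delays RX only */
	SKETCH_RGMII_TXID,	/* PHY delays TX only */
};

struct sketch_delays {
	bool tx;
	bool rx;
};

/* Which internal delays the PHY is expected to insert for a mode. */
static struct sketch_delays sketch_phy_delays(enum sketch_rgmii_mode mode)
{
	struct sketch_delays d = { false, false };

	switch (mode) {
	case SKETCH_RGMII_ID:
		d.tx = true;
		d.rx = true;
		break;
	case SKETCH_RGMII_TXID:
		d.tx = true;
		break;
	case SKETCH_RGMII_RXID:
		d.rx = true;
		break;
	case SKETCH_RGMII:
		break;
	}
	return d;
}

/*
 * A MAC that looks at the mode may provide delays only for plain RGMII;
 * for every other mode its own delays must be disabled.
 */
static bool sketch_mac_delays_allowed(enum sketch_rgmii_mode mode)
{
	return mode == SKETCH_RGMII;
}
```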
|  |  | 
|  | In case neither the Ethernet MAC nor the PHY is capable of providing the | 
|  | required delays, as defined per the RGMII standard, several options may be | 
|  | available: | 
|  |  | 
|  | * Some SoCs may offer a pin pad/mux/controller capable of configuring a given | 
|  | set of pins' strength, delays, and voltage; and it may be a suitable | 
|  | option to insert the expected 2ns RGMII delay. | 
|  |  | 
|  | * Modifying the PCB design to include a fixed delay (e.g: using a specifically | 
|  | designed serpentine), which may not require software configuration at all. | 
|  |  | 
|  | Common problems with RGMII delay mismatch | 
|  |  | 
|  | When there is an RGMII delay mismatch between the Ethernet MAC and the PHY, the | 
|  | clock and data line signals will most likely be unstable at the moment the | 
|  | PHY or MAC samples them to translate them into logical 1 or 0 states and | 
|  | reconstruct the data being transmitted/received. Typical | 
|  | symptoms include: | 
|  |  | 
|  | * Transmission/reception partially works, and there is frequent or occasional | 
|  | packet loss observed | 
|  |  | 
|  | * Ethernet MAC may report some or all packets ingressing with a FCS/CRC error, | 
|  | or just discard them all | 
|  |  | 
|  | * Switching to lower speeds such as 10/100Mbits/sec makes the problem go away | 
|  | (since there is enough setup/hold time in that case) | 
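A short calculation shows why lower speeds hide the problem.  The sketch
below assumes the RGMII clock rates of 125MHz, 25MHz and 2.5MHz for
1000, 100 and 10Mbits/sec respectively, and computes the time between two
consecutive clock edges.  The function names are made up for this example.

```c
#include <stdint.h>

/* RGMII clock frequency in kHz for a link speed in Mbit/s, 0 if unknown. */
static uint32_t sketch_rgmii_clock_khz(int speed_mbps)
{
	switch (speed_mbps) {
	case 1000:
		return 125000;
	case 100:
		return 25000;
	case 10:
		return 2500;
	default:
		return 0;
	}
}

/* Time between two consecutive clock edges, in picoseconds. */
static uint64_t sketch_half_period_ps(int speed_mbps)
{
	uint32_t khz = sketch_rgmii_clock_khz(speed_mbps);

	if (khz == 0)
		return 0;
	/* full period in ps is 1e9 / kHz, so half of it is 5e8 / kHz */
	return 500000000ULL / khz;
}
```

At 1000Mbits/sec the edges are only 4ns apart, so a missing or doubled
2ns delay moves the sampling point by half of that interval.  At
100Mbits/sec the edges are 20ns apart and the same 2ns error is
comparatively small.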
|  |  | 
|  |  | 
|  | Connecting to a PHY | 
|  |  | 
|  | Sometime during startup, the network driver needs to establish a connection | 
|  | between the PHY device, and the network device.  At this time, the PHY's bus | 
|  | and drivers need to all have been loaded, so it is ready for the connection. | 
|  | At this point, there are several ways to connect to the PHY: | 
|  |  | 
|  | 1) The PAL handles everything, and only calls the network driver when | 
|  | the link state changes, so it can react. | 
|  |  | 
|  | 2) The PAL handles everything except interrupts (usually because the | 
|  | controller has the interrupt registers). | 
|  |  | 
|  | 3) The PAL handles everything, but checks in with the driver every second, | 
|  | allowing the network driver to react first to any changes before the PAL | 
|  | does. | 
|  |  | 
|  | 4) The PAL serves only as a library of functions, with the network device | 
|  | manually calling functions to update status, and configure the PHY. | 
|  |  | 
|  |  | 
|  | Letting the PHY Abstraction Layer do Everything | 
|  |  | 
|  | If you choose option 1 (the hope is that every driver can, though the PAL | 
|  | aims to remain useful to drivers that can't), connecting to the PHY is simple: | 
|  |  | 
|  | First, you need a function to react to changes in the link state.  This | 
|  | function follows this prototype: | 
|  |  | 
|  | static void adjust_link(struct net_device *dev); | 
|  |  | 
|  | Next, you need to know the device name of the PHY connected to this device. | 
|  | The name will look something like, "0:00", where the first number is the | 
|  | bus id, and the second is the PHY's address on that bus.  Typically, | 
|  | the bus is responsible for making its ID unique. | 
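As an illustration of the naming scheme, the sketch below builds such a
name from a bus id and a PHY address.  It assumes the "%s:%02x" layout
(bus id, colon, two-digit hexadecimal address), which is what the
kernel's PHY_ID_FMT macro is understood to use; SKETCH_PHY_ID_FMT and
sketch_phy_name() are local names for this example only.

```c
#include <stdio.h>

/* Assumed layout of a PHY device name: "<bus id>:<address in hex>". */
#define SKETCH_PHY_ID_FMT "%s:%02x"

/* Writes the PHY name into buf and returns snprintf()'s result. */
static int sketch_phy_name(char *buf, size_t len, const char *bus_id,
			   unsigned int addr)
{
	return snprintf(buf, len, SKETCH_PHY_ID_FMT, bus_id, addr);
}
```

For bus id "0" and address 0 this yields "0:00".  Note that address 31
becomes "0:1f" rather than "0:31" under this assumption.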
|  |  | 
|  | Now, to connect, just call this function: | 
|  |  | 
|  | phydev = phy_connect(dev, phy_name, &adjust_link, interface); | 
|  |  | 
|  | phydev is a pointer to the phy_device structure which represents the PHY.  If | 
|  | phy_connect is successful, it will return the pointer.  dev, here, is the | 
|  | pointer to your net_device.  Once done, this function will have started the | 
|  | PHY's software state machine, and registered for the PHY's interrupt, if it | 
|  | has one.  The phydev structure will be populated with information about the | 
|  | current state, though the PHY will not yet be truly operational at this | 
|  | point. | 
|  |  | 
|  | PHY-specific flags should be set in phydev->dev_flags prior to the call | 
|  | to phy_connect() such that the underlying PHY driver can check for flags | 
|  | and perform specific operations based on them. | 
|  | This is useful if the system has put hardware restrictions on | 
|  | the PHY/controller, of which the PHY needs to be aware. | 
|  |  | 
|  | interface is a phy_interface_t value which specifies the connection type | 
|  | used between the controller and the PHY.  Examples are GMII, MII, | 
|  | RGMII, and SGMII.  For a full list, see include/linux/phy.h. | 
|  |  | 
|  | Now just make sure that phydev->supported and phydev->advertising have any | 
|  | values pruned from them which don't make sense for your controller (a 10/100 | 
|  | controller may be connected to a gigabit capable PHY, so you would need to | 
|  | mask off SUPPORTED_1000baseT*).  See include/linux/ethtool.h for definitions | 
|  | for these bitfields. Note that you should not SET any bits, except the | 
|  | SUPPORTED_Pause and SUPPORTED_Asym_Pause bits (see below), or the PHY may get | 
|  | put into an unsupported state. | 
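Pruning amounts to clearing bits in both masks.  In the sketch below the
SKETCH_* bit values and struct sketch_phy_modes are local stand-ins chosen
for illustration; real code should operate on phydev->supported and
phydev->advertising with the SUPPORTED_* macros from
include/linux/ethtool.h.

```c
#include <stdint.h>

/* Stand-in bit positions (assumption); use the SUPPORTED_* macros instead. */
#define SKETCH_10baseT_Half	(1u << 0)
#define SKETCH_10baseT_Full	(1u << 1)
#define SKETCH_100baseT_Half	(1u << 2)
#define SKETCH_100baseT_Full	(1u << 3)
#define SKETCH_1000baseT_Half	(1u << 4)
#define SKETCH_1000baseT_Full	(1u << 5)

#define SKETCH_GIGABIT	(SKETCH_1000baseT_Half | SKETCH_1000baseT_Full)

/* Stand-in for the two mask fields of the phy_device structure. */
struct sketch_phy_modes {
	uint32_t supported;
	uint32_t advertising;
};

/* A 10/100 controller removes the gigabit modes; it never sets new bits. */
static void sketch_limit_to_10_100(struct sketch_phy_modes *m)
{
	m->supported &= ~SKETCH_GIGABIT;
	m->advertising &= m->supported;
}
```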
|  |  | 
|  | Lastly, once the controller is ready to handle network traffic, you call | 
|  | phy_start(phydev).  This tells the PAL that you are ready, and configures the | 
|  | PHY to connect to the network.  If you want to handle your own interrupts, | 
|  | just set phydev->irq to PHY_IGNORE_INTERRUPT before you call phy_start. | 
|  | Similarly, if you don't want to use interrupts, set phydev->irq to PHY_POLL. | 
|  |  | 
|  | When you want to disconnect from the network (even if just briefly), you call | 
|  | phy_stop(phydev). | 
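The order of operations (connect, then start, then react to link changes
in the handler) can be modelled in userspace.  Everything below is a
hypothetical stand-in: the sketch_* types and functions imitate the roles
of net_device, phy_device, phy_connect, phy_start and phy_stop, and
sketch_link_change() plays the part of the PAL's state machine.  None of
it is kernel code.

```c
#include <stdbool.h>
#include <stddef.h>

struct sketch_net_device;

/* Stand-in for the phy_device structure. */
struct sketch_phy_device {
	bool link;
	int speed;
	bool started;
	struct sketch_net_device *attached_dev;
	void (*adjust_link)(struct sketch_net_device *dev);
};

/* Stand-in for the net_device structure plus some MAC state. */
struct sketch_net_device {
	struct sketch_phy_device *phydev;
	bool carrier;
	int mac_speed;
};

/* The driver's handler: copy the PHY's view of the link into the MAC. */
static void sketch_adjust_link(struct sketch_net_device *dev)
{
	struct sketch_phy_device *phydev = dev->phydev;

	dev->carrier = phydev->link;
	if (phydev->link)
		dev->mac_speed = phydev->speed;
}

/* Models phy_connect(): bind PHY and net device, remember the handler. */
static struct sketch_phy_device *
sketch_phy_connect(struct sketch_net_device *dev,
		   struct sketch_phy_device *phydev,
		   void (*handler)(struct sketch_net_device *dev))
{
	if (!dev || !phydev || !handler)
		return NULL;
	phydev->attached_dev = dev;
	phydev->adjust_link = handler;
	dev->phydev = phydev;
	return phydev;
}

/* Models phy_start() and phy_stop(). */
static void sketch_phy_start(struct sketch_phy_device *phydev)
{
	phydev->started = true;
}

static void sketch_phy_stop(struct sketch_phy_device *phydev)
{
	phydev->started = false;
}

/* Models the state machine noticing a link change. */
static void sketch_link_change(struct sketch_phy_device *phydev, bool link,
			       int speed)
{
	if (!phydev->started)
		return;
	phydev->link = link;
	phydev->speed = speed;
	phydev->adjust_link(phydev->attached_dev);
}
```

In this model the handler only runs between start and stop, which mirrors
the point made above that the PHY is not truly operational until
phy_start() has been called.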
|  |  | 
|  | Pause frames / flow control | 
|  |  | 
|  | The PHY does not participate directly in flow control/pause frames except by | 
|  | making sure that the SUPPORTED_Pause and SUPPORTED_Asym_Pause bits are set in | 
|  | MII_ADVERTISE to indicate towards the link partner that the Ethernet MAC | 
|  | controller supports such a thing. Since flow control/pause frames generation | 
|  | involves the Ethernet MAC driver, it is recommended that this driver takes care | 
|  | of properly indicating advertisement and support for such features by setting | 
|  | the SUPPORTED_Pause and SUPPORTED_Asym_Pause bits accordingly. This can be done | 
|  | either before or after phy_connect() and/or as a result of implementing the | 
|  | ethtool::set_pauseparam feature. | 
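One common way for a MAC driver to derive the two bits from its receive
and transmit pause settings is sketched below.  The mapping (Pause
follows rx, Asym_Pause is set when rx and tx differ) is an assumption
about what "accordingly" means here, and the SKETCH_* bit values are
stand-ins for the real SUPPORTED_* definitions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-in bit positions (assumption); use the SUPPORTED_* macros instead. */
#define SKETCH_Pause		(1u << 13)
#define SKETCH_Asym_Pause	(1u << 14)

/* Returns modes with the two pause bits recomputed from rx/tx settings. */
static uint32_t sketch_set_pause(uint32_t modes, bool rx, bool tx)
{
	modes &= ~(SKETCH_Pause | SKETCH_Asym_Pause);
	if (rx)
		modes |= SKETCH_Pause;
	if (rx != tx)
		modes |= SKETCH_Asym_Pause;
	return modes;
}
```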
|  |  | 
|  |  | 
|  | Keeping Close Tabs on the PAL | 
|  |  | 
|  | It is possible that the PAL's built-in state machine needs a little help to | 
|  | keep your network device and the PHY properly in sync.  If so, you can | 
|  | register a helper function when connecting to the PHY, which will be called | 
|  | every second before the state machine reacts to any changes.  To do this, you | 
|  | need to manually call phy_attach() and phy_prepare_link(), and then call | 
|  | phy_start_machine() with the second argument set to point to your special | 
|  | handler. | 
|  |  | 
|  | Currently there are no examples of how to use this functionality, and testing | 
|  | on it has been limited because the author does not have any drivers which use | 
|  | it (they all use option 1).  So Caveat Emptor. | 
|  |  | 
|  | Doing it all yourself | 
|  |  | 
|  | There's a remote chance that the PAL's built-in state machine cannot track | 
|  | the complex interactions between the PHY and your network device.  If this is | 
|  | so, you can simply call phy_attach(), and not call phy_start_machine or | 
|  | phy_prepare_link().  This will mean that phydev->state is entirely yours to | 
|  | handle (phy_start and phy_stop toggle between some of the states, so you | 
|  | might need to avoid them). | 
|  |  | 
|  | An effort has been made to make sure that useful functionality can be | 
|  | accessed without the state-machine running, and most of these functions are | 
|  | descended from functions which did not interact with a complex state-machine. | 
|  | However, again, no effort has been made so far to test running without the | 
|  | state machine, so tryer beware. | 
|  |  | 
|  | Here is a brief rundown of the functions: | 
|  |  | 
|  | int phy_read(struct phy_device *phydev, u16 regnum); | 
|  | int phy_write(struct phy_device *phydev, u16 regnum, u16 val); | 
|  |  | 
|  | Simple read/write primitives.  They invoke the bus's read/write function | 
|  | pointers. | 
|  |  | 
|  | void phy_print_status(struct phy_device *phydev); | 
|  |  | 
|  | A convenience function to print out the PHY status neatly. | 
|  |  | 
|  | int phy_start_interrupts(struct phy_device *phydev); | 
|  | int phy_stop_interrupts(struct phy_device *phydev); | 
|  |  | 
|  | Requests the IRQ for the PHY interrupts, then enables them for | 
|  | start, or disables then frees them for stop. | 
|  |  | 
|  | struct phy_device * phy_attach(struct net_device *dev, const char *phy_id, | 
|  | phy_interface_t interface); | 
|  |  | 
|  | Attaches a network device to a particular PHY, binding the PHY to a generic | 
|  | driver if none was found during bus initialization. | 
|  |  | 
|  | int phy_start_aneg(struct phy_device *phydev); | 
|  |  | 
|  | Using variables inside the phydev structure, either configures advertising | 
|  | and resets autonegotiation, or disables autonegotiation, and configures | 
|  | forced settings. | 
|  |  | 
|  | static inline int phy_read_status(struct phy_device *phydev); | 
|  |  | 
|  | Fills the phydev structure with up-to-date information about the current | 
|  | settings in the PHY. | 
|  |  | 
|  | int phy_ethtool_sset(struct phy_device *phydev, struct ethtool_cmd *cmd); | 
|  |  | 
|  | Ethtool convenience functions. | 
|  |  | 
|  | int phy_mii_ioctl(struct phy_device *phydev, | 
|  | struct mii_ioctl_data *mii_data, int cmd); | 
|  |  | 
|  | The MII ioctl.  Note that this function will completely screw up the state | 
|  | machine if you write registers like BMCR, BMSR, ADVERTISE, etc.  Best to | 
|  | use this only to write registers which are not standard, and don't set off | 
|  | a renegotiation. | 
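To make the phy_start_aneg() description and the BMCR warning more
concrete, the sketch below computes the control register value for the
two cases: autonegotiation enabled and restarted, or autonegotiation
disabled with forced speed and duplex.  The bit values follow the
standard register 0 layout; struct sketch_link_cfg and sketch_bmcr_for()
are hypothetical names, and this is not how the PAL itself is
implemented.

```c
#include <stdbool.h>
#include <stdint.h>

/* Basic mode control register (register 0) bits. */
#define SKETCH_BMCR_SPEED1000	0x0040
#define SKETCH_BMCR_FULLDPLX	0x0100
#define SKETCH_BMCR_ANRESTART	0x0200
#define SKETCH_BMCR_ANENABLE	0x1000
#define SKETCH_BMCR_SPEED100	0x2000

/* Hypothetical subset of the settings kept in the phydev structure. */
struct sketch_link_cfg {
	bool autoneg;
	int speed;		/* 10, 100 or 1000 */
	bool full_duplex;
};

/* Returns the BMCR value matching the requested configuration. */
static uint16_t sketch_bmcr_for(const struct sketch_link_cfg *cfg)
{
	uint16_t bmcr = 0;

	if (cfg->autoneg)
		return SKETCH_BMCR_ANENABLE | SKETCH_BMCR_ANRESTART;

	/* Forced mode: autonegotiation stays off, 10Mbit/s sets no bit. */
	if (cfg->speed == 1000)
		bmcr |= SKETCH_BMCR_SPEED1000;
	else if (cfg->speed == 100)
		bmcr |= SKETCH_BMCR_SPEED100;
	if (cfg->full_duplex)
		bmcr |= SKETCH_BMCR_FULLDPLX;
	return bmcr;
}
```

Because these bits decide whether and how the link is negotiated, writing
them behind the state machine's back through the MII ioctl leaves the
phydev structure out of step with the hardware.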
|  |  | 
|  |  | 
|  | PHY Device Drivers | 
|  |  | 
|  | With the PHY Abstraction Layer, adding support for new PHYs is | 
|  | quite easy.  In some cases, no work is required at all!  However, | 
|  | many PHYs require a little hand-holding to get up-and-running. | 
|  |  | 
|  | Generic PHY driver | 
|  |  | 
|  | If the desired PHY doesn't have any errata, quirks, or special | 
|  | features you want to support, then it may be best to not add | 
|  | support, and let the PHY Abstraction Layer's Generic PHY Driver | 
|  | do all of the work. | 
|  |  | 
|  | Writing a PHY driver | 
|  |  | 
|  | If you do need to write a PHY driver, the first thing to do is | 
|  | make sure it can be matched with an appropriate PHY device. | 
|  | This is done during bus initialization by reading the device's | 
|  | UID (stored in registers 2 and 3), then comparing it to each | 
|  | driver's phy_id field by ANDing it with each driver's | 
|  | phy_id_mask field.  Also, it needs a name.  Here's an example: | 
|  |  | 
|  | static struct phy_driver dm9161_driver = { | 
|  | .phy_id         = 0x0181b880, | 
|  | .name           = "Davicom DM9161E", | 
|  | .phy_id_mask    = 0x0ffffff0, | 
|  | ... | 
|  | }; | 
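The matching rule can be sketched as follows.  The UID is assembled from
registers 2 and 3 (register 2 forming the upper 16 bits), then compared
with the driver's phy_id under the driver's phy_id_mask.  The helper
names are invented for this example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Combine PHY identifier registers 2 and 3 into a 32-bit UID. */
static uint32_t sketch_phy_uid(uint16_t reg2, uint16_t reg3)
{
	return ((uint32_t)reg2 << 16) | reg3;
}

/* True if the device UID matches the driver's id under the driver's mask. */
static bool sketch_phy_id_matches(uint32_t uid, uint32_t drv_id,
				  uint32_t drv_mask)
{
	return (uid & drv_mask) == (drv_id & drv_mask);
}
```

With the DM9161E values shown above, the mask 0x0ffffff0 ignores the low
four bits, so a device reporting 0x0181b881 still matches while
0x0181b8a0 does not.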
|  |  | 
|  | Next, you need to specify what features (speed, duplex, autoneg, | 
|  | etc) your PHY device and driver support.  Most PHYs support | 
|  | PHY_BASIC_FEATURES, but you can look in include/linux/mii.h for other | 
|  | features. | 
|  |  | 
|  | Each driver consists of a number of function pointers, documented | 
|  | in include/linux/phy.h under the phy_driver structure. | 
|  |  | 
|  | Of these, only config_aneg and read_status are required to be | 
|  | assigned by the driver code.  The rest are optional.  Also, it is | 
|  | preferred to use the generic phy driver's versions of these two | 
|  | functions if at all possible: genphy_read_status and | 
|  | genphy_config_aneg.  If this is not possible, it is likely that | 
|  | you only need to perform some actions before and after invoking | 
|  | these functions, and so your functions will wrap the generic | 
|  | ones. | 
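The wrapping pattern looks roughly like this.  The sketch is a userspace
model: struct sketch_phy_device, sketch_genphy_config_aneg() (standing in
for genphy_config_aneg) and the vendor quirk are all hypothetical, and
only the call order and error handling are the point.

```c
/* Hypothetical stand-in for the phy_device structure. */
struct sketch_phy_device {
	int fail_quirk;		/* test knob: make the quirk step fail */
	int quirk_applied;
	int aneg_configured;
};

/* Stand-in for the generic driver's config_aneg implementation. */
static int sketch_genphy_config_aneg(struct sketch_phy_device *phydev)
{
	phydev->aneg_configured = 1;
	return 0;
}

/* Hypothetical vendor-specific step, e.g. a write to a quirk register. */
static int sketch_apply_quirk(struct sketch_phy_device *phydev)
{
	if (phydev->fail_quirk)
		return -1;
	phydev->quirk_applied = 1;
	return 0;
}

/* The driver's config_aneg: do the special step, then defer to generic. */
static int sketch_config_aneg(struct sketch_phy_device *phydev)
{
	int err = sketch_apply_quirk(phydev);

	if (err < 0)
		return err;
	return sketch_genphy_config_aneg(phydev);
}
```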
|  |  | 
|  | Feel free to look at the Marvell, Cicada, and Davicom drivers in | 
|  | drivers/net/phy/ for examples (the lxt and qsemi drivers have | 
|  | not been tested as of this writing). | 
|  |  | 
|  | The PHY's MMD register accesses are handled by the PAL framework | 
|  | by default, but can be overridden by a specific PHY driver if | 
|  | required. This could be the case if a PHY was released for | 
|  | manufacturing before the MMD PHY register definitions were | 
|  | standardized by the IEEE. Most modern PHYs will be able to use | 
|  | the generic PAL framework for accessing the PHY's MMD registers. | 
|  | An example of such usage is for Energy Efficient Ethernet support, | 
|  | implemented in the PAL. This support uses the PAL to access MMD | 
|  | registers for EEE query and configuration if the PHY supports | 
|  | the IEEE standard access mechanisms, or can use the PHY's specific | 
|  | access interfaces if overridden by the specific PHY driver. See | 
|  | the Micrel driver in drivers/net/phy/ for an example of how this | 
|  | can be implemented. | 
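For reference, the standard access mechanism mentioned above is assumed
here to be the indirect method through registers 13 and 14: select the
MMD and the address function, write the register address, switch to the
data function, then read or write the data.  The sketch below runs that
sequence against a tiny simulated PHY; all sketch_* names and the
simulated register file are invented for illustration.

```c
#include <stdint.h>

#define SKETCH_MII_MMD_CTRL	0x0d	/* MMD access control register */
#define SKETCH_MII_MMD_DATA	0x0e	/* MMD access address/data register */
#define SKETCH_MMD_FUNC_ADDR	0x0000	/* function: address */
#define SKETCH_MMD_FUNC_DATA	0x4000	/* function: data, no post increment */
#define SKETCH_MMD_FUNC_MASK	0xc000
#define SKETCH_MMD_DEVAD_MASK	0x001f

/* Simulated PHY: 32 MMDs with 64 registers each (a toy address space). */
struct sketch_phy {
	uint16_t ctrl;
	uint16_t addr;
	uint16_t mmd[32][64];
};

/* Simulated plain register write, as the bus write callback would do. */
static void sketch_reg_write(struct sketch_phy *phy, int reg, uint16_t val)
{
	if (reg == SKETCH_MII_MMD_CTRL)
		phy->ctrl = val;
	else if (reg != SKETCH_MII_MMD_DATA)
		return;
	else if ((phy->ctrl & SKETCH_MMD_FUNC_MASK) == SKETCH_MMD_FUNC_ADDR)
		phy->addr = val;
	else
		phy->mmd[phy->ctrl & SKETCH_MMD_DEVAD_MASK][phy->addr & 0x3f] = val;
}

/* Simulated plain register read. */
static uint16_t sketch_reg_read(struct sketch_phy *phy, int reg)
{
	if (reg == SKETCH_MII_MMD_CTRL)
		return phy->ctrl;
	if (reg != SKETCH_MII_MMD_DATA)
		return 0;
	if ((phy->ctrl & SKETCH_MMD_FUNC_MASK) == SKETCH_MMD_FUNC_ADDR)
		return phy->addr;
	return phy->mmd[phy->ctrl & SKETCH_MMD_DEVAD_MASK][phy->addr & 0x3f];
}

/* Point the data register at MMD devad, register regnum. */
static void sketch_mmd_select(struct sketch_phy *phy, int devad,
			      uint16_t regnum)
{
	sketch_reg_write(phy, SKETCH_MII_MMD_CTRL,
			 (uint16_t)(SKETCH_MMD_FUNC_ADDR | devad));
	sketch_reg_write(phy, SKETCH_MII_MMD_DATA, regnum);
	sketch_reg_write(phy, SKETCH_MII_MMD_CTRL,
			 (uint16_t)(SKETCH_MMD_FUNC_DATA | devad));
}

static uint16_t sketch_mmd_read(struct sketch_phy *phy, int devad,
				uint16_t regnum)
{
	sketch_mmd_select(phy, devad, regnum);
	return sketch_reg_read(phy, SKETCH_MII_MMD_DATA);
}

static void sketch_mmd_write(struct sketch_phy *phy, int devad,
			     uint16_t regnum, uint16_t val)
{
	sketch_mmd_select(phy, devad, regnum);
	sketch_reg_write(phy, SKETCH_MII_MMD_DATA, val);
}
```

A PHY driver that overrides the default would replace this sequence with
whatever vendor-specific access its hardware needs.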
|  |  | 
|  | Board Fixups | 
|  |  | 
|  | Sometimes the specific interaction between the platform and the PHY requires | 
|  | special handling.  For instance, the platform may need to change where the | 
|  | PHY's clock input is, or add a delay to account for latency issues in the | 
|  | data path.  In order | 
|  | to support such contingencies, the PHY Layer allows platform code to register | 
|  | fixups to be run when the PHY is brought up (or subsequently reset). | 
|  |  | 
|  | When the PHY Layer brings up a PHY it checks to see if there are any fixups | 
|  | registered for it, matching based on UID (contained in the PHY device's phy_id | 
|  | field) and the bus identifier (contained in phydev->dev.bus_id).  Both must | 
|  | match; however, two constants, PHY_ANY_ID and PHY_ANY_UID, are provided as | 
|  | wildcards for the bus ID and UID, respectively. | 
|  |  | 
|  | When a match is found, the PHY layer will invoke the run function associated | 
|  | with the fixup.  This function is passed a pointer to the phy_device of | 
|  | interest.  It should therefore only operate on that PHY. | 
|  |  | 
|  | The platform code can either register the fixup using phy_register_fixup(): | 
|  |  | 
|  | int phy_register_fixup(const char *phy_id, | 
|  | u32 phy_uid, u32 phy_uid_mask, | 
|  | int (*run)(struct phy_device *)); | 
|  |  | 
|  | Or using one of the two stubs, phy_register_fixup_for_uid() and | 
|  | phy_register_fixup_for_id(): | 
|  |  | 
|  | int phy_register_fixup_for_uid(u32 phy_uid, u32 phy_uid_mask, | 
|  | int (*run)(struct phy_device *)); | 
|  | int phy_register_fixup_for_id(const char *phy_id, | 
|  | int (*run)(struct phy_device *)); | 
|  |  | 
|  | The stubs set one of the two matching criteria, and set the other one to | 
|  | match anything. | 
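The matching rule with its two wildcards can be sketched as below.  The
values chosen for SKETCH_PHY_ANY_ID and SKETCH_PHY_ANY_UID, the
struct sketch_fixup layout and the function name are assumptions made
for this example; platform code should use PHY_ANY_ID and PHY_ANY_UID
from the kernel headers.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Stand-ins for the PHY_ANY_ID / PHY_ANY_UID wildcards (assumed values). */
#define SKETCH_PHY_ANY_ID	"MATCH ANY PHY"
#define SKETCH_PHY_ANY_UID	0xffffffffu

/* Hypothetical record of one registered fixup's matching criteria. */
struct sketch_fixup {
	const char *bus_id;
	uint32_t phy_uid;
	uint32_t phy_uid_mask;
};

/* Both criteria must match, unless the fixup uses a wildcard for one. */
static bool sketch_fixup_matches(const struct sketch_fixup *f,
				 const char *phy_bus_id, uint32_t phy_uid)
{
	bool id_ok = strcmp(f->bus_id, SKETCH_PHY_ANY_ID) == 0 ||
		     strcmp(f->bus_id, phy_bus_id) == 0;
	bool uid_ok = f->phy_uid == SKETCH_PHY_ANY_UID ||
		      (f->phy_uid & f->phy_uid_mask) ==
		      (phy_uid & f->phy_uid_mask);

	return id_ok && uid_ok;
}
```

In this model a fixup registered through the _for_uid() stub corresponds
to a record whose bus_id is the wildcard, and one registered through
_for_id() to a record whose phy_uid is the wildcard.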
|  |  | 
|  | When phy_register_fixup() or *_for_uid()/*_for_id() is called from a module, | 
|  | the fixup must be unregistered, and its allocated memory freed, when the | 
|  | module goes away. | 
|  |  | 
|  | Call one of the following functions before unloading the module. | 
|  |  | 
|  | int phy_unregister_fixup(const char *phy_id, u32 phy_uid, u32 phy_uid_mask); | 
|  | int phy_unregister_fixup_for_uid(u32 phy_uid, u32 phy_uid_mask); | 
|  | int phy_unregister_fixup_for_id(const char *phy_id); | 
|  |  | 
|  | Standards | 
|  |  | 
|  | IEEE Standard 802.3: CSMA/CD Access Method and Physical Layer Specifications, Section Two: | 
|  | http://standards.ieee.org/getieee802/download/802.3-2008_section2.pdf | 
|  |  | 
|  | RGMII v1.3: | 
|  | http://web.archive.org/web/20160303212629/http://www.hp.com/rnd/pdfs/RGMIIv1_3.pdf | 
|  |  | 
|  | RGMII v2.0: | 
|  | http://web.archive.org/web/20160303171328/http://www.hp.com/rnd/pdfs/RGMIIv2_0_final_hp.pdf |