Copyright WILEY-VCH Verlag GmbH & Co. KGaA, 69469 Weinheim, Germany, 2020.

# ADVANCED MATERIALS

Supporting Information

for Adv. Mater., DOI: 10.1002/adma.202002431

An Atomically Thin Optoelectronic Machine Vision Processor

Houk Jang, Chengye Liu, Henry Hinton, Min-Hyun Lee, Haeryong Kim, Minsu Seol, Hyeon-Jin Shin, Seongjun Park,\* and Donhee Ham\*


#### Method S1. Monolayer MoS<sub>2</sub> characterization

Monolayer $MoS_2$ was synthesized by metal-organic chemical vapor deposition (MOCVD) on a 6-inch silicon dioxide (SiO<sub>2</sub>) wafer. Two local spectral analyses, PL (Figure S1a) and Raman (Figure S1b) spectroscopy, were performed, and a cross-sectional TEM image (Figure S1c) was taken to verify the monolayer thickness. The PL spectrum exhibits a clear direct band-gap transition at 1.87 eV without any evidence of an indirect band-gap transition, which agrees well with the previously reported properties of monolayer $MoS_2$.<sup>[1]</sup> Furthermore, the Raman spectrum exhibits a narrow separation between the $A_{1g}$ and $E_{2g}$ peaks (~19.5 cm<sup>-1</sup>), which also agrees with the previous report.<sup>[2]</sup> The cross-sectional TEM image provides direct evidence of a monolayer. In addition to these local measurements, the PL intensity map (Figure S1d) exhibits uniform PL peak intensity, demonstrating that the synthesized monolayer $MoS_2$ is uniform across the 6-inch wafer.

#### Method S2. Device fabrication

All the MoS<sub>2</sub> photo-FETs share the same device structure: a channel length/width of 20 $\mu$m/50 $\mu$m, a 30 nm Al<sub>2</sub>O<sub>3</sub> gate dielectric, and a 1.5 $\mu$m process margin for misalignment. Each photo-FET has a back-gated structure to allow for top-down exposure. One photo-FET constitutes a unit pixel, which is repeated every 300 $\mu$m horizontally and vertically to form an array. Fabrication began with wafer cleaning by dipping 4-inch p-type (100) Si wafers with 300 nm thermal oxide in Piranha solution ($H_2SO_4$:$H_2O_2$ = 3:1) at 80 °C for 10 min. Row interconnects (Cr/Au = 10 nm/90 nm) were formed by a conventional lift-off process. LOR 3A was employed for all lift-off processes to increase fabrication yield. All metal deposition was done with a thermal evaporator (TE) at a rate of 0.5 Å/sec. The inter-layer dielectric (ILD; SiO<sub>2</sub>, 200 nm) was deposited by plasma-enhanced chemical vapor deposition (PECVD) at 250 °C. After vias were formed through the ILD by conventional photolithography and wet etching with buffered oxide etchant (BOE; 6:1), column interconnects (Cr/Au = 10 nm/90 nm) and transistor gate electrodes (Cr/Au = 10 nm/25 nm) were formed by a lift-off process. The gate dielectric (Al<sub>2</sub>O<sub>3</sub>, 30 nm) was formed using atomic layer deposition (ALD; Ar as carrier gas, TMA as precursor, H<sub>2</sub>O as oxidant) at 250 °C, followed by via formation by photolithography and wet etching. After the chip surface was cleaned using O<sub>2</sub> plasma (500 W, 200 °C for 1 min), a sheet of MoS<sub>2</sub> supported by a PMMA layer was transferred onto the chip. The MoS<sub>2</sub> channel for each pixel was isolated via conventional photolithography and reactive ion etching (RIE; O<sub>2</sub> 2 sccm, Ar 5 sccm, CHF<sub>3</sub> 10 sccm, RF power 30 W, plate power 10 W). S/D electrodes (Au = 35 nm) were formed by a lift-off process, and the devices were encapsulated with 30 nm Al<sub>2</sub>O<sub>3</sub> via ALD at 200 °C. Regions over the contact pads in the top encapsulation layer were opened to permit wire bonding.

#### Method S3. PCB design

The PCB interface uses four 16-channel, 12-bit DACs (AD5767, Analog Devices) to provide bias voltages to the 32 drain and 32 gate rows. Likewise, four 8-channel, 20-bit, switched-capacitor ADCs (DDC118, Texas Instruments) measure the currents at each column. A microcontroller (Teensy 3.6 with NXP MK66FX1M0) handles the serial interface for each peripheral device, as well as the transmission of processed data to the recording computer (Figure S2a and S3).

#### Method S4. Image calibration

To calibrate out the photo-response and PPC variations from the raw images, we performed the 65-second image sensing (the same as in the other image-capture processes) for each of 3 solid-color input images (red, green, and blue) at 12 evenly spaced intensities between 0 and 255 (Figure S9a). This generates a look-up table for automated 12-point calibration for each color at each pixel at all frame times (two calibration examples for one pixel, namely row 22, column 14, at 0 s, the moment the shutter is closed, and at 60 s after the shutter is closed, are shown in Figure S9b and c). This calibration is performed for all $30 \times 30$ pixels using the look-up table of each pixel (Figure S9d). Finally, as an independent step towards obtaining the images in the figures, each of the 14 dead pixels in the $30 \times 30$ array is assigned the average of the calibrated intensities of its neighboring pixels.
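The two calibration steps described above can be sketched in Python as follows. This is only an illustrative sketch, not the code used for the measurements: the function names `calibrate` and `fill_dead_pixels`, the made-up response curve, the use of linear interpolation between the 12 calibration points, and the choice of 8-connected neighbors are all assumptions.

```python
import numpy as np

# Hypothetical calibration data for ONE pixel, ONE color, ONE frame time:
# 12 evenly spaced input intensities and the photocurrent measured at each.
levels = np.linspace(0, 255, 12)               # known input intensities
lut_current = 1e-9 * (levels / 255.0) ** 0.7   # made-up monotonic response (A)

def calibrate(measured_current, lut_current, levels):
    """Map a measured photocurrent to an intensity by linear interpolation
    between the calibration points (lut_current must be increasing).
    Linear interpolation is an assumption, not stated in the text."""
    return np.interp(measured_current, lut_current, levels)

def fill_dead_pixels(image, dead_mask):
    """Replace each dead pixel with the mean of its live neighbors.
    An 8-connected neighborhood is assumed here."""
    out = image.astype(float).copy()
    rows, cols = image.shape
    for r, c in zip(*np.nonzero(dead_mask)):
        r0, r1 = max(r - 1, 0), min(r + 2, rows)
        c0, c1 = max(c - 1, 0), min(c + 2, cols)
        patch = image[r0:r1, c0:c1].astype(float)
        live = ~dead_mask[r0:r1, c0:c1]   # excludes the dead pixel itself
        if live.any():
            out[r, c] = patch[live].mean()
    return out
```

In a full pipeline, `calibrate` would be applied with a separate table per pixel, color, and frame time, and `fill_dead_pixels` would run once on each calibrated image.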

#### Method S5. Optical programming of conductance matrix

The optical programming of the conductance matrix was performed in the following order: 1) crossbar array reset, 2) row and column selection, 3) selection of reference pixels, and 4) iterative encoding. In detail, 1) the crossbar array was first reset by applying a global $V_G$ of 7 V for 10 seconds, reverting each FET to a nonuniform initial conductance level. After resetting, the crossbar array was set to operate in the subthreshold region ($V_G = -6$ V with $V_D = 0.1$ V). 2) Given the nonuniform initial conductance levels, we partitioned a subarray of the desired size. All possible partition plans were considered, and the arrangement that yielded maximal uniformity between FET conductance values in each selected column was chosen. 3) One reference pixel FET was selected for each column prior to the start of the programming process. The reference pixel is the one with the largest $G - kw_{target}$ in each column, where G is the pixel conductance, $w_{target}$ is the target weight, and k is a preselected value representing conductance per unit weight. In each subsequent iteration cycle, the reference pixel's conductance level was used as the reference, and the target conductance levels of the other pixels in the column were obtained using the following equation:

$$G_{i,j} = k(w_{i,j} - w_{ref,j}) + G_{ref,j}$$

where $G_{i,j}$ and $G_{ref,j}$ represent the target conductance level of the pixel in row i, column j, and the conductance of the reference pixel of column j, respectively; $w_{i,j}$ and $w_{ref,j}$ are the corresponding weights; and k is the preselected value representing conductance per unit weight (ranging from 1 nS to 4.5 nS). 4) During each cycle, the conductance levels of all FETs in the selected subarray were measured, and an image was projected onto the array for 5 seconds to induce a different amount of PPC in each pixel. The projected image was aligned to the crossbar array so that each pixel in the array was exposed to the intended light intensity. The incident light intensity for each pixel was set based on the light intensity used in the previous cycle and on the conductance error (the difference between the pixel conductance and the target conductance) of the current encoding cycle relative to that of the previous cycle. After the 5-second image exposure, we waited for 10 seconds to allow enough time for the PPC to settle prior to the next encoding cycle. The same procedure was repeated until the conductance error of each pixel was within the acceptable error range.
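Steps 3 and 4 rely on two simple computations: choosing the reference pixel of each column and deriving the target conductances from it. A minimal NumPy sketch is given below; the function names and all numerical values are hypothetical and serve only to illustrate the equation above.

```python
import numpy as np

def select_references(G, W, k):
    """For each column, return the row index with the largest G - k*w_target."""
    return np.argmax(G - k * W, axis=0)

def target_conductance(G, W, k):
    """Apply G_ij = k*(w_ij - w_ref,j) + G_ref,j column by column."""
    ref_rows = select_references(G, W, k)
    cols = np.arange(G.shape[1])
    G_ref = G[ref_rows, cols]   # measured conductance of each reference pixel
    w_ref = W[ref_rows, cols]   # target weight of each reference pixel
    return k * (W - w_ref) + G_ref, ref_rows

rng = np.random.default_rng(0)
k = 2e-9                                    # hypothetical: 2 nS per unit weight
W = rng.uniform(-1.0, 1.0, size=(4, 3))     # hypothetical target weights
G = rng.uniform(5e-9, 15e-9, size=(4, 3))   # hypothetical measured conductances (S)

G_target, ref_rows = target_conductance(G, W, k)
```

One property follows directly from the selection rule: because the reference maximizes $G - kw_{target}$ within its column, no target conductance lies below the conductance the pixel already has, so every pixel only needs to be programmed upward.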

#### Method S6. Current to output conversion

To perform vector-matrix multiplication of a vector x (1 × m) and a matrix W (m × n) using the crossbar array, the matrix W was encoded as a conductance matrix G (m × n) in a partitioned subarray, and the vector x was represented as a voltage vector v (1 × m) and applied to the drain row wires of the subarray. The product of the input voltage vector v and the conductance matrix G was measured at the column wires of the subarray as a current vector i (1 × n). The product y (1 × n) of xW was recovered from the measured current vector i using the following equation:

$$y = \frac{\boldsymbol{i} - (\boldsymbol{G}_{ref} - k\boldsymbol{w}_{ref}) \|\boldsymbol{v}\|_1}{kC}$$

where $G_{ref}$ and $w_{ref}$ are vectors $(1 \times n)$ of the reference FETs' conductance values and the corresponding target weights, respectively (one reference FET was selected for each column when the subarray was encoded; see Method S5 for more details), $\|v\|_1$ denotes the L1-norm of the input voltage vector, which is equivalent to the sum of the input voltage magnitudes, *C* is a scaling factor that maps the input *x* to the desired voltage range, and *k* is a preselected value denoting conductance per unit weight.
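The recovery equation can be checked numerically against an ideally encoded conductance matrix. The sketch below assumes non-negative inputs (so that the L1-norm equals the plain sum of the voltages), a perfectly programmed array, and hypothetical values for `k`, `C`, and the conductances. Using row 0 as the reference in every column is a simplification for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
m, n = 5, 3
k = 2e-9      # S per unit weight (hypothetical)
C = 0.1       # V per unit input (hypothetical scaling factor)

x = rng.uniform(0.0, 1.0, size=m)          # non-negative input vector
W = rng.uniform(-1.0, 1.0, size=(m, n))    # target weight matrix
w_ref = W[0]                               # pretend row 0 holds the reference FETs
G_ref = np.full(n, 10e-9)                  # hypothetical reference conductances (S)
G = k * (W - w_ref) + G_ref                # ideal encoding of W as conductances

v = C * x                                  # drain voltages applied to the rows
i = v @ G                                  # column currents measured by the ADCs

def recover_output(i, v, G_ref, w_ref, k, C):
    """y = (i - (G_ref - k*w_ref) * ||v||_1) / (k*C)."""
    return (i - (G_ref - k * w_ref) * np.sum(np.abs(v))) / (k * C)

y = recover_output(i, v, G_ref, w_ref, k, C)
```

With ideal encoding, `y` matches `x @ W` to within floating-point error. In a real array, the residual conductance error of each pixel would appear as a deviation between the two.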

#### Method S7. CNN software simulation

The target weight matrices *W* for the convolutional layer and the FC layer were obtained from offline training using Keras with the TensorFlow backend. We used the Sequential model in Keras, and three layers were added to this model. The first layer is a Conv2D layer using 10 kernels, each of size $4 \times 4$, with a stride of 1 and a rectified linear unit (ReLU) activation function. This is followed by a MaxPooling2D layer with a $5 \times 5$ pooling window. The output of this max-pooling layer is flattened into a $40 \times 1$ vector and fed into a Dense layer with 10 outputs and softmax activation. Zero biases were imposed for both the convolutional and the FC layers. The simulation used 60,000 training images and 10,000 test images. All images were from the MNIST database and were rescaled to $13 \times 13$ from their original size of $28 \times 28$. Training was carried out for 100 epochs with a batch size of 1024.
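The layer sizes quoted above can be verified with a few lines of shape arithmetic. The sketch assumes no padding in the convolution and a pooling stride equal to the window size (the Keras defaults), as well as a single-channel greyscale input. The helper names are hypothetical.

```python
def conv_output_size(n, kernel, stride=1):
    """Output length of an unpadded ('valid') convolution along one axis."""
    return (n - kernel) // stride + 1

def pool_output_size(n, window):
    """Output length of non-overlapping pooling (stride = window, no padding)."""
    return n // window

image = 13        # rescaled MNIST image is 13 x 13
n_kernels = 10

conv = conv_output_size(image, kernel=4, stride=1)   # feature map side length
pooled = pool_output_size(conv, window=5)            # side length after pooling
flattened = pooled * pooled * n_kernels              # input length of the FC layer

# Weight counts with zero biases, single input channel assumed.
conv_weights = 4 * 4 * 1 * n_kernels
fc_weights = flattened * 10
```

This gives $10 \times 10$ feature maps, $2 \times 2$ pooled maps, and a flattened vector of length 40, in agreement with the $40 \times 1$ vector stated above. Under these assumptions the network has 160 convolutional weights and 400 FC weights.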

#### Text S1. Photoresponse

We measured $I_D$-$V_D$-$V_G$ maps with and without incident light (532 nm, 1.4 W/m<sup>2</sup>, Figure S6a and b) to generate a photocurrent ($I_{ph} = I_{D,Light} - I_{D,Dark}$) map (Figure S6c). Responsivity and detectivity (Figure S6c and d) were calculated by the following equations:

Responsivity, 
$$R = \frac{I_{ph}}{P \cdot A}$$

where  $I_{ph}$  is the photocurrent, P is the intensity of the incident light per unit area, and A is the channel area of a phototransistor.

Detectivity, 
$$D^* = \frac{R \cdot A^{1/2}}{(2 \cdot e \cdot I_{dark})^{1/2}}$$

where *R* is the responsivity, *A* is the channel area of each phototransistor, *e* is the unit charge, and  $I_{dark}$  is the dark current.
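As a worked example of the two definitions, the sketch below evaluates them in SI units. The channel area assumes that the 20 $\mu$m and 50 $\mu$m figures in Method S2 are the channel length and width. The photocurrent and dark current values are hypothetical placeholders, not measured data.

```python
import math

E_CHARGE = 1.602176634e-19  # elementary charge (C)

def responsivity(i_ph, intensity, area):
    """R = I_ph / (P * A), in A/W for SI inputs."""
    return i_ph / (intensity * area)

def detectivity(resp, area, i_dark):
    """D* = R * sqrt(A) / sqrt(2 * e * I_dark), in m*Hz^0.5/W for SI inputs."""
    return resp * math.sqrt(area) / math.sqrt(2.0 * E_CHARGE * i_dark)

area = 20e-6 * 50e-6    # channel area (m^2), assuming 20 um x 50 um
intensity = 1.4         # incident intensity (W/m^2), as in Text S1
i_ph = 1e-9             # hypothetical photocurrent (A)
i_dark = 1e-12          # hypothetical dark current (A)

R = responsivity(i_ph, intensity, area)
D_si = detectivity(R, area, i_dark)
D_jones = D_si * 100.0  # convert m*Hz^0.5/W to cm*Hz^0.5/W (Jones)
```

The detectivity expression used here considers only shot noise from the dark current, which is what the $(2 \cdot e \cdot I_{dark})^{1/2}$ term in the denominator represents.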

#### Text S2. PPC effect

To characterize the PPC effect, we measured $I_D$ (at $V_D = 0.3$ V and $V_G = -6$ V) for 4 minutes, with incident light (532 nm, 1.4 W/m<sup>2</sup>) applied during the first minute (Figure 2a). A significant PPC effect was observed throughout the 3 minutes measured after reverting to the dark state. PPC is often described by the random local potential fluctuation (RLPF) model, in which local potential fluctuations arise from randomly distributed intrinsic defects or extrinsic charge impurities. These fluctuations can trap either electrons or holes at local potential minima, preventing the trapped charges from undergoing recombination. The counterparts to these trapped charges then contribute to the photocurrent until recombination occurs, resulting in PPC. The decay behavior in this model follows a stretched exponential decay rather than a single exponential decay. The equation for the stretched exponential decay is as follows:

$$I_{PPC}(t) = I_0 \cdot exp\left[-\left(\frac{t}{\tau}\right)^{\beta}\right], \quad 0 < \beta < 1$$

where t is the time, $I_0$ is the peak photocurrent, $\tau$ is the decay constant, and $\beta$ is a scaling exponent. The normalized photocurrent is well fitted by the stretched exponential decay (with $\tau$ of 2.4 msec ± 4.6 msec and $\beta$ of 0.039 ± 0.002), with an expected signal of 20% after 10 years evaluated by extrapolation (Figure S7). We can electrically erase this PPC effect by applying a large gate voltage (e.g., 7 V in this report) to remove the trapped carriers; see Figure S8 for details. The magnitude of PPC depends on the number of photogenerated carriers, a portion of which are trapped to induce PPC, and this number in turn depends on the amount, or dosage, of absorbed photons. Therefore, the amount of PPC depends on the wavelength and intensity of the incident light as well as on the exposure time (Figure S10).
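For readers who wish to fit their own retention data to the stretched exponential form, one simple approach is to linearize it: for normalized data, $\ln(-\ln(I/I_0)) = \beta \ln t - \beta \ln \tau$, so $\beta$ and $\tau$ follow from a straight-line fit. The sketch below applies this to noiseless synthetic data with hypothetical parameters. It is not the fitting procedure used for Figure S7, and the function names are illustrative.

```python
import numpy as np

def stretched_exp(t, i0, tau, beta):
    """I(t) = I0 * exp(-(t / tau)**beta)."""
    return i0 * np.exp(-(t / tau) ** beta)

def fit_stretched_exp(t, i_norm):
    """Estimate (tau, beta) from normalized data with 0 < i_norm < 1,
    using a linear fit of ln(-ln(i_norm)) against ln(t)."""
    y = np.log(-np.log(i_norm))
    slope, intercept = np.polyfit(np.log(t), y, 1)
    beta = slope
    tau = np.exp(-intercept / beta)   # since intercept = -beta * ln(tau)
    return tau, beta

# Synthetic, noiseless example with hypothetical parameters.
t = np.logspace(-1, 2, 50)                 # 0.1 s to 100 s
data = stretched_exp(t, 1.0, 2.0, 0.5)     # I0 = 1, tau = 2 s, beta = 0.5
tau_fit, beta_fit = fit_stretched_exp(t, data)
```

On real data the double logarithm amplifies noise near $I/I_0 \approx 1$, so a nonlinear least-squares fit of the original equation may be preferable. The linearized form is mainly useful for obtaining initial guesses.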

### References

[1] K. F. Mak, C. Lee, J. Hone, J. Shan, T. F. Heinz, *Phys. Rev. Lett.* 2010, *105*, 136805.

[2] Y. Niu, S. Gonzalez-Abad, R. Frisenda, P. Marauhn, M. Drüppel, P. Gant, R. Schmidt, N. S. Taghavi, D. Barcons, A. J. Molina-Mendoza, S. M. Vasconcellos, R. Bratschitsch, D. P. Lara, M. Rohlfing, A. Castellanos-Gomez, *Nanomaterials* 2018, *8*, 725.



Figure S1. Characterization of monolayer  $MoS_2$  synthesized by MOCVD. a) PL spectrum. b) Raman spectrum. c) Cross-sectional TEM image. d) PL intensity mapping of the synthesized monolayer  $MoS_2$  with the PL measurements done at approximately 180 uniformly distributed locations across the 6-inch wafer.



Figure S2. Schematic diagrams of crossbar structure and current paths. a) Schematic of the integrated  $MoS_2$  photo-FET crossbar array (inner dashed box) and the custom PCB electronics (outer dashed box). The drain node of each FET is connected to the row-interconnect (red) and source node is connected to the column interconnect (black), forming a cross-point conductor between each row and column. Another 32 row interconnects are connected to the FETs' gate node. The PCB contains 64 DAC channels to provide bias or input signals for gate (blue) and drain (red) rows, and 32 ADC channels to collect column currents. Current paths (green) in the b) 'sensing mode' and the c) 'recognition mode'.



Figure S3. Printed circuit boards (PCBs) for measurement setup. Motherboard (front on top, back on bottom) and daughterboard PCBs, with component groups highlighted. The motherboard PCB contains four DACs for setting gate and drain voltages (16 channels each), and four, 8-channel source-side ADCs for measuring each column's current. Control switches are available for chip configurations that implement additional unit device selector switches. The daughterboard mounts on the motherboard via a 50-pin bus. The bus provides the motherboard with regulated supply voltages, as well as digital I/O for setting bias voltages and recording current data to stream over USB to a measurement computer.



Figure S4. Transfer characteristic analysis of the $MoS_2$ FET array. a) Mean transfer curve (solid line) in linear (red, left axis) and log (blue, right axis) scale with $V_D$ of 0.3 V. The shaded boundary represents the standard deviation of the transfer curves of the FETs. b) Scatter plot of the measured $V_{th}$ and mobility for the 946 working FETs in the array. Histograms of the measured $V_{th}$ and mobility values are shown at the right and top axes, respectively, with Gaussian fitted curves (solid red lines).



Figure S5. I - V curve of a MoS<sub>2</sub> FET at operating bias. The I - V curve of a MoS<sub>2</sub> FET in the subthreshold region with $V_G = -6$ V, where the array was operated. The yellow highlighted region indicates the operating range of $V_D$ (<0.3 V; in the linear region).



Figure S6. Photoresponse maps of a single MoS<sub>2</sub> photo-FET. a) $I_D$-$V_D$-$V_G$ map without and b) with the incident light (532 nm, 1.4 W/m<sup>2</sup>), and c) corresponding responsivity and d) detectivity, with $V_D$ from 0 V to 5 V and $V_G$ from -7 V to 7 V.



Figure S7. Retention characteristics of PPC. The retention curve for normalized photocurrent and its fitted curve with stretched exponential equation, showing 20% of PPC is expected after 10 years (dashed line).



Figure S8. Time-dependent response of the drain current $I_{DS}$ through 130 cycles, with each cycle consisting of the following steps: 1) application of a gate voltage $V_G$ of 7 V to erase and reset (A to C in Part a); 2) application of a bias $V_G = -6$ V to initialize for photoresponse measurement (C to E in Part a); 3) exposure to light for a finite duration t (E to F in Part a); 4) persistent photoresponse observation (F to G in Part a). Each cycle has identical experimental parameters, except the light exposure time t, which is increased by 0.2 s at every cycle. The reset (Step 1) first sends $I_{DS}$ to the same peak value (black dashed line in Part b; Point B in Part a) with no appreciable cycle-to-cycle difference. After reaching Point B, and with $V_G = 7$ V still applied, $I_{DS}$ droops slightly to Point C (Part a) before Step 2, due to trapping of a small number of electrons in the gate area, but the Point C current values also show no appreciable cycle-to-cycle variation. The measurement bias initialization (Step 2) first drops $I_{DS}$ to the same minimum value (red dashed line in Part b; Point D in Part a) with no appreciable cycle-to-cycle difference. After reaching Point D, and with $V_G = -6$ V still applied, $I_{DS}$ rises to Point E (Part a); this rise does not matter in the weight writing, because the absolute final value of $I_{DS}$ is what sets the desired weight. The peak photoresponse (blue dashed line in Part b; Point F in Part a) and the persistent photoresponse (green dashed line in Part b; Point G in Part a) increase linearly as the cycle is repeated. This is consistent with the increasing light exposure time from cycle to cycle.



Figure S9. Calibration procedure to obtain the image of Fig. 2. a) A photocurrent map measured at an example pixel with respect to the intensity of the light (R, G, B) and the duration of the dark period after the shutter is closed. b, c) Examples of calibration by translating measured photocurrent to pixel intensity at 0 s, the moment the shutter is closed, and at 60 s after the shutter is closed. The red square in the inset indicates the pixel being calibrated. d) $30 \times 30$ photocurrent maps at 12 evenly spaced intensities for red, green, and blue light at time = 0 s.



Figure S10. Absorbed light-dosage-dependent magnitude of PPC. a) The photocurrent in log scale as a function of time (at $V_D$ of 0.3 V and $V_G$ of -6 V) after initial exposure (3.2 W/m<sup>2</sup> for 1 sec) with wavelength varying from 450 nm to 800 nm. b) The magnitude of PPC, evaluated as the remnant photocurrent after 30 sec in the dark state, as a function of the input light wavelength. c) Log-scale photocurrent as a function of time (same voltages as in a) for various intensities of incident light (532 nm) from 0.03 to 30 W/m<sup>2</sup> and d) the accompanying PPC magnitude as a function of the incident light intensity. e) Log-scale photocurrent as a function of time for various exposure times from 0.2 sec to 30 sec (532 nm, 3.2 W/m<sup>2</sup>) and f) the accompanying PPC magnitude (evaluated after 150 sec in the dark state) as a function of exposure time.



Figure S11. Transient photocurrent curve with optically excited compound PPC. The PPC is added by two incident light excitations (532 nm,  $3.2 \text{ W/m}^2$ , 1 sec) at 0 and 60 sec.



Figure S12.  $120 \times 120$  pixel composite, greyscale images of "Cameraman". The input image is shown in the top left box. The Cameraman image is used by permission of MIT. Subsequent boxes show the captured and calibrated images at 0, 15, 30, 45, and 60 seconds after 5 sec exposure.



Figure S13. Image processing with optically encoded filters. a) The target kernels (top) and optically programmed conductance kernels (bottom) in the crossbar. b) The experimentally measured output current for all vector-matrix calculations as a function of the simulated current. c) The correlation coefficient between the experimentally processed images and the simulated images.



Figure S14. Optical input images and memorized images of 10 examples. The input images and memorized images (captured after 1 min in the dark state) for the 10 representative images (one for each digit) with a resolution of $13 \times 13$.



Figure S15. Convolutional layer outputs for the 10 representative images. a) 10 feature maps of size $10 \times 10$ for each of the 10 representative images. b) Average correlation coefficients between the experimentally calculated feature maps and the simulated ones, categorized by digit (10 feature maps for each of 100 images = 1000 feature maps for each digit).



Figure S16. FC output and decisions of the 10 representative images. Each example shows the memorized image (left), FC output magnitudes (line in right graph) and Bayesian probabilities (bar in right graph), obtained by software (red) and experiments (black).